Microcanonical equilibrium macrostates are characterized as the solutions of a constrained
minimization problem, while canonical equilibrium macrostates are characterized as the solutions of a
related, unconstrained minimization problem. In the paper [19] by Ellis, Haven, and Turkington,
the problem of ensemble equivalence was completely solved at two separate but related levels: the
level of equilibrium macrostates, which focuses on relationships between the corresponding sets of
equilibrium macrostates, and the thermodynamic level, which focuses on when the microcanonical
entropy $s$ can be expressed as the Legendre-Fenchel transform of the canonical free energy. A neat
but not quite precise statement of the main result in [19] is that the microcanonical and canonical
ensembles are equivalent at the level of equilibrium macrostates if and only if they are equivalent at
the thermodynamic level, which is the case if and only if the microcanonical entropy $s$ is concave.

The present paper extends the results in [19] significantly by addressing the following motivational
question. Given that the microcanonical ensemble is not equivalent with the canonical ensemble, is it
possible to replace the canonical ensemble with a generalized canonical ensemble that is equivalent
with the microcanonical ensemble? The generalized canonical ensemble that we consider is obtained
from the standard canonical ensemble by adding an exponential factor involving a continuous
function $g$ of the Hamiltonian. The special case in which $g$ is quadratic plays a central role in the theory,
giving rise to a generalized canonical ensemble known in the literature as the Gaussian ensemble.

As in [19], we analyze the equivalence of the two ensembles at both the level of equilibrium
macrostates and the thermodynamic level. A neat but not quite precise statement of the main result
in the present paper is that the microcanonical and generalized canonical ensembles are equivalent
at the level of equilibrium macrostates if and only if they are equivalent at the thermodynamic level,
which is the case if and only if the generalized microcanonical entropy $s - g$ is concave. The
considerable freedom that one has in choosing $g$ has the important consequence that even when
the microcanonical and standard canonical ensembles are not equivalent, one can often find $g$ with
the property that the microcanonical and generalized canonical ensembles satisfy a strong form of
equivalence which we call universal equivalence. For example, if the microcanonical entropy is $C^2$,
then universal equivalence of ensembles holds with $g$ taken from a class of quadratic functions. This
use of functions $g$ to obtain ensemble equivalence is a counterpart to the use of penalty functions and
augmented Lagrangians in global optimization.

Keywords: Generalized canonical ensemble, equivalence of ensembles, microcanonical entropy, large deviation 
principle 



I. INTRODUCTION 

The problem of ensemble equivalence is a fundamental one lying at the foundations of equilibrium sta- 
tistical mechanics. When formulated in mathematical terms, it is apparent that this problem also addresses 
a fundamental issue in global optimization. Given a constrained minimization problem, under what condi- 
tions does there exist a related, unconstrained minimization problem having the same minimum points? 

In order to explain the connection between ensemble equivalence and global optimization and in order to
outline the contributions of this paper, we introduce some notation. Let $\mathcal{X}$ be a space, $I$ a function
mapping $\mathcal{X}$ into $[0, \infty]$, and $H$ a function mapping $\mathcal{X}$ into $\mathbb{R}^\sigma$, where $\sigma$ is a positive integer.
For $u \in \mathbb{R}^\sigma$ we consider the following constrained minimization problem:

minimize $I(x)$ over $x \in \mathcal{X}$ subject to the constraint $H(x) = u$. (1.1)
A partial answer to the question posed at the end of the first paragraph can be found by introducing the
following related, unconstrained minimization problem for $\beta \in \mathbb{R}^\sigma$:

minimize $I(x) + \langle \beta, H(x) \rangle$ over $x \in \mathcal{X}$, (1.2)

where $\langle \cdot, \cdot \rangle$ denotes the Euclidean inner product on $\mathbb{R}^\sigma$. The theory of Lagrange multipliers outlines
suitable conditions under which the solutions of the constrained problem (1.1) lie among the critical points of
$I + \langle \beta, H \rangle$. However, it does not give, as we will do in Theorems 3.1 and 3.4, necessary and sufficient
conditions for the solutions of (1.1) to coincide with the solutions of the unconstrained minimization problem
(1.2). By giving such necessary and sufficient conditions, we make contact with the duality theory of global
optimization and the method of augmented Lagrangians [?, §2.2], [?, §6.4]. In the context of global
optimization the primal function and the dual function play the same roles that the (generalized) microcanonical
entropy and the (generalized) canonical free energy play in statistical mechanics. Similarly, the replacement
of the Lagrangian by the augmented Lagrangian in global optimization is paralleled by our replacement of
the canonical ensemble by the generalized canonical ensemble.

The two minimization problems (1.1) and (1.2) arise in a natural way in the context of equilibrium
statistical mechanics [19], where in the case $\sigma = 1$, $u$ denotes the mean energy and $\beta$ the inverse
temperature. We define $\mathcal{E}^u$ and $\mathcal{E}_\beta$ to be the respective sets of points solving the constrained problem (1.1)
and the unconstrained problem (1.2); i.e.,

$\mathcal{E}^u = \{x \in \mathcal{X} : I(x) \text{ is minimized subject to } H(x) = u\}$ (1.3)

and

$\mathcal{E}_\beta = \{x \in \mathcal{X} : I(x) + \langle \beta, H(x) \rangle \text{ is minimized}\}$. (1.4)

For a given statistical mechanical model $\mathcal{X}$ represents the set of all possible equilibrium macrostates. As
we will outline in Section 2, the theory of large deviations allows one to identify $\mathcal{E}^u$ as the subset of $\mathcal{X}$
consisting of equilibrium macrostates for the microcanonical ensemble and $\mathcal{E}_\beta$ as the subset consisting of
equilibrium macrostates for the canonical ensemble.

Defined by conditioning the Hamiltonian to have a fixed value, the microcanonical ensemble expresses 
the conservation of physical quantities such as the energy and is the more fundamental of the two ensembles. 
Among other reasons, the canonical ensemble was introduced by Gibbs [28] in the hope that in the limit
$n \to \infty$ the two ensembles are equivalent; i.e., all asymptotic properties of the model obtained via the
microcanonical ensemble could be realized as asymptotic properties obtained via the canonical ensemble. 
However, as numerous studies discussed near the end of this introduction have shown, in general this is not 
the case. There are many examples of statistical mechanical models for which nonequivalence of ensembles 
holds over a wide range of model parameters and for which physically interesting microcanonical equilibria 
are often omitted by the canonical ensemble. 

The paper [19] investigates this question in detail, analyzing equivalence of ensembles in terms of
relationships between $\mathcal{E}^u$ and $\mathcal{E}_\beta$. In turn, these relationships are expressed in terms of support and concavity
properties of the microcanonical entropy

$s(u) = -\inf\{I(x) : x \in \mathcal{X}, H(x) = u\}$.

The main results in [19] are summarized in Theorem 3.1, which we now discuss under the simplifying
assumption that dom $s$ is an open subset of $\mathbb{R}^\sigma$.

We focus on $u \in$ dom $s$. Part (a) of Theorem 3.1 states that if $s$ has a strictly supporting hyperplane at $u$,
then full equivalence of ensembles holds in the sense that there exists a $\beta$ such that $\mathcal{E}^u = \mathcal{E}_\beta$. In particular,
if dom $s$ is convex and open and $s$ is strictly concave on dom $s$, then $s$ has a strictly supporting hyperplane
at all $u$ [Thm. 3.3(a)] and thus full equivalence of ensembles holds at all $u$. In this case we say that the
microcanonical and canonical ensembles are universally equivalent.
The most surprising result, given in part (c), is that if $s$ does not have a supporting hyperplane at $u$, then
nonequivalence of ensembles holds in the strong sense that $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \in \mathbb{R}^\sigma$. That is, if $s$
does not have a supporting hyperplane at $u$ (equivalently, if $s$ is not concave at $u$), then microcanonical
equilibrium macrostates cannot be realized canonically. This is to be contrasted with part (d), which states
that for any $x \in \mathcal{E}_\beta$ there exists $u$ such that $x \in \mathcal{E}^u$; i.e., canonical equilibrium macrostates can always be
realized microcanonically. Thus of the two ensembles the microcanonical is the richer.
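
The dichotomy between parts (a) and (c) can be checked numerically on a toy problem. The following Python sketch is an illustration only: the grid of macrostates, the double-well function $I(x) = (x^2 - 1)^2$, and the choice $H(x) = x$ are assumptions made for this example and are not taken from any model in the paper. For this choice $s(u) = -(u^2 - 1)^2$ has no supporting line at $u = 0$ but has a strictly supporting line at $u = 1.2$.

```python
import numpy as np

# Hypothetical toy model (not from the paper): macrostates are grid points
# x in [-1.5, 1.5], H(x) = x, and I(x) = (x^2 - 1)^2 >= 0 is a double well.
xs = np.arange(-150, 151) / 100.0
I_vals = (xs**2 - 1.0) ** 2
H_vals = xs.copy()
TOL = 1e-9


def micro_set(u):
    """E^u: grid points minimizing I subject to H(x) = u."""
    feasible = np.abs(H_vals - u) < TOL
    if not feasible.any():
        return set()
    best = I_vals[feasible].min()
    return set(xs[feasible & (I_vals <= best + TOL)].tolist())


def canon_set(beta):
    """E_beta: grid points minimizing I + beta * H without constraint."""
    total = I_vals + beta * H_vals
    return set(xs[total <= total.min() + TOL].tolist())


betas = np.linspace(-10.0, 10.0, 2001)

# u = 0: s has no supporting line, so no beta should realize E^0.
realized_at_0 = any(micro_set(0.0) & canon_set(b) for b in betas)

# u = 1.2: the supporting slope is beta = s'(u) = -I'(u).
beta_star = -4 * 1.2 * (1.2**2 - 1)
full_equiv = micro_set(1.2) == canon_set(beta_star)

print(realized_at_0, full_equiv)
```

Scanning a finite grid of $\beta$ values is only suggestive of the statement "for all $\beta$". In this toy case the exact statement is elementary: the unconstrained minimum is at most $\min\{I(1) + \beta, I(-1) - \beta\} = -|\beta|$, which is always below $I(0) = 1$.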

The starting point of the present paper is the following motivational question suggested by Theorem
3.1. Given that the microcanonical ensemble is not equivalent with the canonical ensemble on a subset of
values of $u$, is it possible to replace the canonical ensemble with a generalized canonical ensemble that is
universally equivalent with the microcanonical ensemble; i.e., fully equivalent at all $u$?

The generalized canonical ensemble that we consider is a natural perturbation of the standard
canonical ensemble, obtained from it by adding an exponential factor involving a continuous function $g$ of the
Hamiltonian. The special case in which $g$ is quadratic plays a central role in the theory, giving rise to a
generalized canonical ensemble known in the literature as the Gaussian ensemble [?].
As these papers discuss, an important feature of Gaussian ensembles is that they allow one to account for
ensemble-dependent effects in finite systems. Although not referred to by name, the Gaussian ensemble also
plays a key role in [33], where it is used to address equivalence-of-ensemble questions for a point-vortex
model of fluid turbulence.

Let us focus on the case of quadratic $g$ because it illustrates nicely why the answer to the motivational
question is yes in a wide variety of circumstances. In order to simplify the notation, we work with $u = 0$
and the corresponding set $\mathcal{E}^0$ of equilibrium macrostates. We denote by $\|\cdot\|$ the Euclidean norm on $\mathbb{R}^\sigma$
and consider the Gaussian ensemble defined in (2.6) with $g(u) = \gamma\|u\|^2$ for $\gamma \geq 0$. As we will outline in
Section 2, the theory of large deviations allows one to identify the subset of $\mathcal{X}$ consisting of equilibrium
macrostates for the Gaussian ensemble with the set

$\mathcal{E}(\gamma)_\beta = \{x \in \mathcal{X} : I(x) + \langle \beta, H(x) \rangle + \gamma\|H(x)\|^2 \text{ is minimized}\}$. (1.5)

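$\mathcal{E}(\gamma)_\beta$ can be viewed as an approximation to the set $\mathcal{E}^0$ of equilibrium macrostates for the microcanonical
ensemble. This follows from the calculation

$\{x \in \mathcal{X} : \lim_{\gamma \to \infty} \left( I(x) + \langle \beta, H(x) \rangle + \gamma\|H(x)\|^2 \right) \text{ is minimized}\}$

$= \{x \in \mathcal{X} : I(x) \text{ is minimized subject to } H(x) = 0\} = \mathcal{E}^0$.

The limit equals $I(x)$ when $H(x) = 0$ and $+\infty$ otherwise, so minimizing it enforces the constraint.
This observation makes it plausible that there exist a $\beta$ and a sufficiently large $\gamma$ such that $\mathcal{E}^0$ equals $\mathcal{E}(\gamma)_\beta$;
i.e., the microcanonical ensemble and the Gaussian ensemble are fully equivalent. As we will see, under
suitable hypotheses this and much more are true.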
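
The penalty heuristic behind this calculation can be illustrated on a toy problem. In the sketch below, the grid of macrostates, the function $I(x) = (x^2 - 1)^2$, and the choice $H(x) = x$ are hypothetical and are not taken from the paper. With $\beta = 0$ the objective is $x^4 + (\gamma - 2)x^2 + 1$, so the Gaussian set should coincide with the constrained set at $u = 0$ once $\gamma$ exceeds 2, and should differ from it for smaller $\gamma$.

```python
import numpy as np

# Hypothetical toy model (assumption, not from the paper).
xs = np.arange(-150, 151) / 100.0   # grid of macrostates
I_vals = (xs**2 - 1.0) ** 2         # toy rate function, I >= 0
H_vals = xs.copy()                  # toy interaction function H(x) = x
TOL = 1e-9


def gaussian_set(beta, gamma):
    """Grid minimizers of I + beta*H + gamma*H^2, cf. the set in (1.5)."""
    total = I_vals + beta * H_vals + gamma * H_vals**2
    return set(xs[total <= total.min() + TOL].tolist())


def micro_set_at_zero():
    """Grid minimizers of I subject to H(x) = 0."""
    feasible = np.abs(H_vals) < TOL
    best = I_vals[feasible].min()
    return set(xs[feasible & (I_vals <= best + TOL)].tolist())


E0 = micro_set_at_zero()
small_gamma = gaussian_set(0.0, 1.0)   # penalty too weak
large_gamma = gaussian_set(0.0, 3.0)   # penalty strong enough

print(E0, small_gamma, large_gamma)
```

The point of the example is that a finite $\gamma$ already suffices here; the limit $\gamma \to \infty$ is not needed to recover the constrained minimizer.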

Our results apply to a much wider class of generalized canonical ensembles, of which the Gaussian
ensemble is a special case. Given a continuous function $g$ mapping $\mathbb{R}^\sigma$ into $\mathbb{R}$, the associated set of
equilibrium macrostates is defined as

$\mathcal{E}(g)_\beta = \{x \in \mathcal{X} : I(x) + \langle \beta, H(x) \rangle + g(H(x)) \text{ is minimized}\}$.

This set reduces to (1.5) when $g(u) = \gamma\|u\|^2$.

The utility of the generalized canonical ensemble rests on the simplicity with which the function $g$
defining this ensemble enters the formulation of ensemble equivalence. Essentially all the results in [19]
concerning ensemble equivalence, including Theorem 3.1, generalize to the setting of the generalized canonical
ensemble by replacing the microcanonical entropy $s$ by the generalized microcanonical entropy $s - g$. The
generalization of Theorem 3.1 is stated in Theorem 3.4, which gives all possible relationships between
the set $\mathcal{E}^u$ of equilibrium macrostates for the microcanonical ensemble and the set $\mathcal{E}(g)_\beta$ of equilibrium
macrostates for the generalized canonical ensemble. These relationships are expressed in terms of support
and concavity properties of $s - g$. The proof of Theorem 3.4 shows how easily it follows from Theorem
3.1, in which all equivalence and nonequivalence relationships between $\mathcal{E}^u$ and $\mathcal{E}_\beta$ are expressed in terms
of support and concavity properties of $s$.

For the purpose of applications the most important consequence of Theorem 3.4 is given in part (a),
which we now discuss under the simplifying assumption that dom $s$ is an open subset of $\mathbb{R}^\sigma$. We focus on
$u \in$ dom $s$. Part (a) states that if $s - g$ has a strictly supporting hyperplane at $u$, then full equivalence of
ensembles holds in the sense that there exists a $\beta$ such that $\mathcal{E}^u = \mathcal{E}(g)_\beta$. In particular, if dom $s$ is convex and
open and if $s - g$ is strictly concave on dom $s$, then $s - g$ has a strictly supporting hyperplane at all $u$ [Thm.
3.6(a)] and thus full equivalence of ensembles holds at all $u$. In this case we say that the microcanonical
and generalized canonical ensembles are universally equivalent.

The only requirement on the function $g$ defining the generalized canonical ensemble is that $g$ is
continuous. The considerable freedom that one has in choosing $g$ makes it possible to define a generalized
canonical ensemble that is universally equivalent with the microcanonical ensemble when the
microcanonical and standard canonical ensembles are not equivalent on a subset of values of $u$. In Theorems 5.2-5.4
several examples of universal equivalence are derived under natural smoothness and boundedness
conditions on $s$, while Theorem 5.5 derives a weaker form of universal equivalence under other conditions. In the
first, second, and fourth of these theorems $g$ is taken from a set of quadratic functions, and the associated
ensembles are Gaussian.

Theorem 5.2, which applies when the dimension $\sigma = 1$, is particularly useful. It shows that if $s$ is
$C^2$ and $s''$ is bounded above on the interior of dom $s$, then for any

$\gamma > \frac{1}{2} \cdot \sup_{x \in \mathrm{int}(\mathrm{dom}\, s)} s''(x)$,

$s(u) - \gamma u^2$ is strictly concave on dom $s$. By part (b) of Theorem 3.6 and part (a) of Theorem 3.4 it
follows that the microcanonical ensemble and the Gaussian ensemble defined in terms of $\gamma$ are universally
equivalent. The strict concavity of $s(u) - \gamma u^2$ also implies that the generalized canonical free energy is
differentiable on $\mathbb{R}$ [Thm. 4.1(c)], a condition guaranteeing the absence of a discontinuous, first-order phase
transition with respect to the Gaussian ensemble. Theorem 5.3 is the analogue of Theorem 5.2 that treats
arbitrary dimension $\sigma \geq 2$. Again, we prove that for all sufficiently large $\gamma$, the microcanonical
ensemble and the Gaussian ensemble defined in terms of $\gamma$ are universally equivalent. These two theorems are
particularly satisfying because they make rigorous the intuition underlying the introduction of the Gaussian
ensemble: because it approximates the microcanonical ensemble in the limit $\gamma \to \infty$, universal ensemble
equivalence should hold for all sufficiently large $\gamma$.
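
The concavity criterion described above can be sanity-checked numerically. The sketch below uses a hypothetical entropy $s(u) = -(u^2 - 1)^2$ on $(-1.5, 1.5)$, chosen only for illustration, for which $\sup s'' = 4$ and the threshold is therefore $\gamma > 2$. Strict concavity is tested through second differences on a grid, which is a numerical proxy and not a proof; the helper name is an assumption of this example.

```python
import numpy as np


def is_strictly_concave_on_grid(values):
    """True if every second difference of equally spaced samples is negative."""
    second_diff = values[2:] - 2.0 * values[1:-1] + values[:-2]
    return bool(np.all(second_diff < 0.0))


u = np.linspace(-1.5, 1.5, 601)
s = -(u**2 - 1.0) ** 2              # toy nonconcave entropy (assumption)
s_second = 4.0 - 12.0 * u**2        # its exact second derivative
threshold = 0.5 * s_second.max()    # (1/2) * sup s'' on the grid

# gamma above the threshold: s - gamma*u^2 should be strictly concave.
above = is_strictly_concave_on_grid(s - 2.5 * u**2)
# gamma below the threshold: concavity should fail near u = 0.
below = is_strictly_concave_on_grid(s - 1.5 * u**2)

print(threshold, above, below)
```

The second derivative of $s(u) - \gamma u^2$ is $s''(u) - 2\gamma$, which explains the factor $\frac{1}{2}$ in the threshold.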

The criterion in Theorem 5.2 that $s''$ is bounded above on the interior of dom $s$ is essentially optimal
for the existence of a fixed quadratic function $g$ guaranteeing the strict concavity of $s - g$ on dom $s$. The
situation in which $s''(u) \to \infty$ as $u$ approaches a boundary point can often be handled by Theorem 5.5,
which is a local version of Theorem 5.2.

Besides studying ensemble equivalence at the level of equilibrium macrostates, one can also analyze it
at the thermodynamic level. This level focuses on Legendre-Fenchel-transform relationships involving the
basic thermodynamic functions in the three ensembles: the microcanonical entropy $s(u)$, on the one hand,
and the canonical free energy and generalized canonical free energy, on the other. The analysis is carried
out in Section IV, where we also relate ensemble equivalence at the two levels. A neat but not quite precise
statement of the main result proved in that section is that the microcanonical ensemble and the canonical
ensemble (resp., generalized canonical ensemble) are equivalent at the level of equilibrium macrostates if
and only if they are equivalent at the thermodynamic level, which is the case if and only if $s$ (resp., $s - g$) is
concave.
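
Equivalence at the thermodynamic level can be probed numerically through a double Legendre-Fenchel transform. The sketch below assumes the concave convention $\varphi(\beta) = \inf_u \{\beta u - s(u)\}$ for the free energy and $s^{**}(u) = \inf_\beta \{\beta u - \varphi(\beta)\}$ for the double transform, and it uses a hypothetical entropy $s(u) = -(u^2 - 1)^2$ on a grid; neither the function nor the grids come from the paper. Grid points where $s^{**}(u) = s(u)$ are those at which $s$ has a supporting line.

```python
import numpy as np

# Hypothetical one-dimensional example (assumption, not from the paper).
u = np.arange(-150, 151) / 100.0
s = -(u**2 - 1.0) ** 2
beta = np.linspace(-10.0, 10.0, 2001)

# phi(beta) = inf_u { beta*u - s(u) }, evaluated on the grids.
phi = np.min(beta[:, None] * u[None, :] - s[None, :], axis=1)

# s**(u) = inf_beta { beta*u - phi(beta) }, the concave hull of s.
s_hull = np.min(beta[:, None] * u[None, :] - phi[:, None], axis=0)

# Thermodynamic equivalence at u holds where s(u) = s**(u).
equivalent = np.isclose(s, s_hull, rtol=0.0, atol=1e-9)

print(u[equivalent].min(), u[equivalent].max(), int(equivalent.sum()))
```

For this toy entropy the concave hull is identically 0 on $[-1, 1]$, so the two functions agree exactly for $|u| \geq 1$ and differ on the nonconcave region $|u| < 1$.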

One of the seeds out of which the present paper germinated is the paper [20], in which we study the
equivalence of the microcanonical and canonical ensembles for statistical equilibrium models of coherent
structures in two-dimensional and quasi-geostrophic turbulence. Numerical computations demonstrate that
nonequivalence of ensembles occurs over a wide range of model parameters and that physically interesting
microcanonical equilibria are often omitted by the canonical ensemble. In addition, in Section 5 of [20],
we establish the nonlinear stability of the steady mean flows corresponding to microcanonical equilibria
via a new Lyapunov argument. The associated stability theorem refines the well-known Arnold stability
theorems, which do not apply when the microcanonical and canonical ensembles are not equivalent. The
Lyapunov functional appearing in this new stability theorem is defined in terms of a generalized
thermodynamic potential similar in form to

$I(x) + \langle \beta, H(x) \rangle + \gamma\|H(x)\|^2$,

the minimum points of which define the set of equilibrium macrostates for the Gaussian ensemble [see
(1.5)]. Such Lyapunov functionals arise in the study of constrained optimization problems, where they are
known as augmented Lagrangians [?, 42].



Another seed out of which the present paper germinated is the work of Hetherington and coworkers
on the Gaussian ensemble. Reference [31] is the first paper that defined the Gaussian
ensemble as a modification of the canonical ensemble in which the standard exponential Boltzmann term
involving the energy is augmented by an additional term involving the square of the energy. As shown
in [8, 9, 32, 50], such a modified canonical ensemble arises when a sample system is in contact with a
finite heat reservoir. From this point of view, the Gaussian ensemble can be viewed as an intermediate
ensemble between the microcanonical ensemble, whose definition involves no reservoir, and the canonical ensemble,
which is defined in terms of an infinite reservoir. The Gaussian ensemble is used in [?] to study
microcanonical-canonical discrepancies in finite-size systems; such discrepancies are generally present near
first-order phase transitions.

Gaussian ensembles are also considered in [32] and more or less implicitly in [33]. Reference [32] is a
theoretical study of the Gaussian ensemble which derives it from the maximum entropy principle and
studies its stability properties. The second paper [33] uses some mathematical methods that are reminiscent of
the Gaussian ensemble to study a point-vortex model of fluid turbulence. By sending $\gamma \to \infty$ after the fluid
limit $n \to \infty$, the authors recover the special class of nonlinear, stationary Euler flows that is expected from
the microcanonical ensemble. Their use of Gaussian ensembles improves previous studies in which either
the logarithmic singularities of the Hamiltonian must be regularized or equivalence of ensembles must be
assumed. As they point out, the latter is not a satisfactory assumption because the ensembles are
nonequivalent in certain geometries in which conditionally stable configurations exist in the microcanonical ensemble
but not in the canonical ensemble. Their paper motivated in part the analysis of ensemble equivalence in the
present paper, which focuses on generalized canonical ensembles with a fixed function $g$ and, as a special
case, Gaussian ensembles in which $\gamma$ is fixed and is not sent to $\infty$.

In addition to the connections with [?], the present paper also builds on the wide literature
concerning equivalence of ensembles in statistical mechanics. An overview of this literature is given in the
introduction of [19]. A number of papers on this topic, including [?], investigate
equivalence of ensembles using the theory of large deviations. In [?, §7] and [?, §7.3] there is a discussion
of nonequivalence of ensembles for the simplest mean-field model in statistical mechanics; namely, the
Curie-Weiss model of a ferromagnet. However, despite the mathematical sophistication of these and other
studies, none of them except for our paper explicitly addresses the general issue of the nonequivalence
of ensembles, which seems to be the typical behavior for a wide class of models arising in various areas of
statistical mechanics.

Nonequivalence of ensembles at the thermodynamic level has been observed in a number of long-range,
mean-field spin models, including the Hamiltonian mean-field model [13, 37], the mean-field X-Y model [?],
and the mean-field Blume-Emery-Griffiths model [?]. In [23] ensemble nonequivalence for the mean-field
Blume-Emery-Griffiths model was demonstrated to hold also at the level of equilibrium macrostates via
numerical computations. For a mean-field version of the Potts model called the Curie-Weiss-Potts model,
equivalence and nonequivalence of ensembles at the level of equilibrium macrostates is analyzed in detail
in [?]. Ensemble nonequivalence has also been observed in models of turbulence [?],
models of plasmas [36, 49], gravitational systems [29, 30, 41, 51], and a model of the Lennard-Jones gas
[?]. Many of these models can also be analyzed by the methods of [19] and the present paper. A detailed
discussion of ensemble nonequivalence for models of turbulence is given in [19, §1.4].

The study of ensemble equivalence at the level of equilibrium macrostates involves relationships among
the sets $\mathcal{E}^u$, $\mathcal{E}_\beta$, and $\mathcal{E}(g)_\beta$ of equilibrium macrostates for the three ensembles. These sets are subsets
of $\mathcal{X}$, which in many cases, including short-range spin models and models of turbulence, is an
infinite-dimensional space. The most important discovery in our work on this topic is that all relationships among
these possibly infinite-dimensional sets are completely determined by support and concavity properties of
the finite-dimensional, and in many applications one-dimensional, functions $s$ and $s - g$. The main tools for
analyzing ensemble equivalence are the theory of large deviations and the theory of concave functions, both
of which exhibit an analogous conceptual structure. On the one hand, the two theories provide powerful,
investigative methodologies in which formal manipulations or geometric intuition can lead one to the correct
answer. On the other hand, both theories are fraught with numerous technicalities which, if emphasized,
can obscure the big picture. In the present paper we emphasize the big picture by relegating a number of
technicalities to the appendix. The reference [?] treats in greater detail some of the material in the present
paper including background on concave functions.

In Section II of this paper, we state the hypotheses on the statistical mechanical models to which the
theory of the present paper applies, give a number of examples of such models, and then present the results
on ensemble equivalence at the level of equilibrium macrostates for the three ensembles. In Section IV
we relate ensemble equivalence at the level of equilibrium macrostates and at the thermodynamic level 
via the Legendre-Fenchel transform and a mild generalization suitable for treating quantities arising in 
the generalized canonical ensemble. In Section 4 we present a number of results giving conditions for the 
existence of a generalized canonical ensemble that is universally equivalent to the microcanonical ensemble. 
In all but one of these results the generalized canonical ensemble is Gaussian. The appendix contains a 
number of technical results on concave functions needed in the main body of the paper. 

II. DEFINITIONS OF MODELS AND ENSEMBLES 

The main contribution of this paper is that when the canonical ensemble is nonequivalent to the micro- 
canonical ensemble on a subset of values of u, it can often be replaced by a generalized canonical ensemble 
that is equivalent to the microcanonical ensemble at all u. Before introducing the various ensembles as well 
as the methodology for proving this result, we first specify the class of statistical mechanical models under 
consideration. The models are defined in terms of the following quantities. 

• A sequence of probability spaces $(\Omega_n, \mathcal{F}_n, P_n)$ indexed by $n \in \mathbb{N}$, which typically represents a
sequence of finite-dimensional systems. The $\Omega_n$ are the configuration spaces, $\omega \in \Omega_n$ are the
microstates, and the $P_n$ are the prior measures.

• A sequence of positive scaling constants $a_n \to \infty$ as $n \to \infty$. In general $a_n$ equals the total number
of degrees of freedom in the model. In many cases $a_n$ equals the number of particles.

• A positive integer $\sigma$ and for each $n \in \mathbb{N}$ measurable functions $H_{n,1}, \ldots, H_{n,\sigma}$ mapping $\Omega_n$ into $\mathbb{R}$.
For $\omega \in \Omega_n$ we define

$h_{n,i}(\omega) = \frac{1}{a_n} H_{n,i}(\omega)$ and $h_n(\omega) = (h_{n,1}(\omega), \ldots, h_{n,\sigma}(\omega))$.

The $H_{n,i}$ include the Hamiltonian and, if $\sigma \geq 2$, other dynamical invariants associated with the
model.
A large deviation analysis of the general model is possible provided that we can find, as specified in
the next four items, a space of macrostates, macroscopic variables, and interaction representation functions 
and provided that the macroscopic variables satisfy the large deviation principle (LDP) on the space of 
macrostates. 



1. Space of macrostates. This is a complete, separable metric space $\mathcal{X}$, which represents the set of all
possible macrostates.

2. Macroscopic variables. These are a sequence of random variables $Y_n$ mapping $\Omega_n$ into $\mathcal{X}$. These
functions associate a macrostate in $\mathcal{X}$ with each microstate $\omega \in \Omega_n$.

3. Interaction representation functions. These are bounded, continuous functions $H_1, \ldots, H_\sigma$
mapping $\mathcal{X}$ into $\mathbb{R}$ such that as $n \to \infty$

$h_{n,i}(\omega) = H_i(Y_n(\omega)) + o(1)$ uniformly for $\omega \in \Omega_n$; (2.1)

i.e.,

$\lim_{n \to \infty} \sup_{\omega \in \Omega_n} |h_{n,i}(\omega) - H_i(Y_n(\omega))| = 0$.

We define $H = (H_1, \ldots, H_\sigma)$. The functions $H_i$ enable us to write the $h_{n,i}$, either exactly or
asymptotically, as functions of the macrostate via the macroscopic variables $Y_n$.

4. LDP for the macroscopic variables. There exists a function $I$ mapping $\mathcal{X}$ into $[0, \infty]$ and having
compact level sets such that with respect to $P_n$ the sequence $Y_n$ satisfies the LDP on $\mathcal{X}$ with rate
function $I$ and scaling constants $a_n$. In other words, for any closed subset $F$ of $\mathcal{X}$

$\limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{Y_n \in F\} \leq -\inf_{x \in F} I(x)$,

and for any open subset $G$ of $\mathcal{X}$

$\liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{Y_n \in G\} \geq -\inf_{x \in G} I(x)$.

It is helpful to summarize the LDP by the formal notation $P_n\{Y_n \in dx\} \asymp \exp[-a_n I(x)]$. This
notation expresses the fact that, to a first degree of approximation, $P_n\{Y_n \in dx\}$ behaves like an
exponential that decays to 0 whenever $I(x) > 0$.
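
The exponential decay expressed by the LDP can be seen in the simplest possible setting. The sketch below is an illustration outside the models of this paper: it takes $n$ fair coin flips with $a_n = n$, lets the macroscopic variable be the fraction of heads $S_n/n$, and compares $-\frac{1}{n} \log P\{S_n/n \geq a\}$ with the Cramér rate function $I(a) = a \log(2a) + (1 - a)\log(2(1 - a))$ for $a = 0.7$. The function names are hypothetical.

```python
import math
from fractions import Fraction


def rate_function(x):
    """Relative entropy of Bernoulli(x) with respect to Bernoulli(1/2)."""
    return x * math.log(2.0 * x) + (1.0 - x) * math.log(2.0 * (1.0 - x))


def empirical_rate(n, a):
    """-(1/n) log P{S_n/n >= a} for n fair coin flips, from exact counts."""
    k_min = math.ceil(a * n)  # exact because a is a Fraction
    tail = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return -(math.log(tail) - n * math.log(2.0)) / n


a = Fraction(7, 10)
I_a = rate_function(float(a))

# The gap to I(a) should shrink as n grows, as the LDP upper bound suggests.
errors = {n: abs(empirical_rate(n, a) - I_a) for n in (200, 2000)}
print(I_a, errors)
```

The closed set here is $F = [a, 1]$, and since $a > 1/2$ the infimum of $I$ over $F$ is attained at $a$. The remaining gap at finite $n$ comes from subexponential prefactors that the LDP does not capture.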

As specified in item 3, the functions $H_i$ are bounded on $\mathcal{X}$, and because of (2.1) the functions $h_{n,i}$
are also bounded on $\Omega_n$. In [10] it is shown that all the results in this paper are valid under much weaker
hypotheses on $H_i$, including $H_i$ that are not bounded on $\mathcal{X}$.

The assumptions on the statistical mechanical models just stated as well as a number of definitions to
follow are valid for lattice spin and other models. These assumptions differ slightly from those in [19],
where they are adapted for applications to statistical mechanical models of coherent structures in
turbulence. The major difference is that $H_n$ in [19] is replaced by $h_n$ here in several equations: the asymptotic
relationship (2.1), the definition (2.3) of the microcanonical ensemble $P_n^{u,r}$, and the definition (2.4) of the
canonical ensemble $P_{n,\beta}$. In addition, in [19] the LDP for $Y_n$ is studied with respect to $P_{n,a_n\beta}$, in which $\beta$
is scaled by $a_n$; here the LDP for $Y_n$ is studied with respect to $P_{n,\beta}$. With only such superficial changes in
notation, all the results in [19] are applicable here, and, in turn, all the results derived here are applicable to
the models considered in [19].
A wide variety of statistical mechanical models satisfy the hypotheses listed at the start of this section
and so can be studied by the methods of [19] and the present paper. We next give six examples. The
first two are long-range spin systems, the third a class of short-range spin systems, the fourth a model of 
two-dimensional turbulence, the fifth a model of quasi-geostrophic turbulence, and the sixth a model of 
dispersive wave turbulence. 

Example 2.1.

1. Mean-field Blume-Emery-Griffiths model. The Blume-Emery-Griffiths model [?] is one of the few and
certainly one of the simplest lattice-spin models known to exhibit, in the mean-field approximation, both a
continuous, second-order phase transition and a discontinuous, first-order phase transition. This mean-field
model is defined on the set $\{1, 2, \ldots, n\}$. The spin at site $j \in \{1, 2, \ldots, n\}$ is denoted by $\omega_j$, a quantity
taking values in $\Lambda = \{-1, 0, 1\}$. The configuration spaces for the model are $\Omega_n = \Lambda^n$, the prior measures
$P_n$ are product measures on $\Omega_n$ with identical one-dimensional marginals $\rho = \frac{1}{3}(\delta_{-1} + \delta_0 + \delta_1)$, and for
$\omega = (\omega_1, \ldots, \omega_n) \in \Omega_n$ the Hamiltonian is given by

$H_n(\omega) = \sum_{j=1}^n \omega_j^2 - \frac{K}{n} \left( \sum_{j=1}^n \omega_j \right)^2$,

where $K$ is a fixed positive number. The space of macrostates for this model is the set of probability
measures on $\Lambda$, the macroscopic variables are the empirical measures associated with the spin configurations
$\omega$, and the associated LDP is Sanov's Theorem, for which the rate function is the relative entropy with
respect to $\rho$. The large deviation analysis of the model is given in [22], which also analyzes the phase
transition in the model. Equivalence and nonequivalence of ensembles for this model is studied at the
thermodynamic level in [?] and at the level of equilibrium macrostates in [23].

2. Curie-Weiss-Potts model. The Curie-Weiss-Potts model is a long-range, mean-field approximation to
the well-known Potts model [53]. It is defined on the set $\{1, 2, \ldots, n\}$. The spin at site $j \in \{1, 2, \ldots, n\}$ is
denoted by $\omega_j$, a quantity taking values in the set $\Lambda$ consisting of $q$ distinct vectors $\theta^i \in \mathbb{R}^q$, where $q \geq 3$
is a fixed integer. The configuration spaces for the model are $\Omega_n = \Lambda^n$, the prior measures $P_n$ are product
measures on $\Omega_n$ with identical one-dimensional marginals $\rho = \frac{1}{q} \sum_{i=1}^q \delta_{\theta^i}$, and for $\omega = (\omega_1, \ldots, \omega_n) \in \Omega_n$ the
Hamiltonian is given by

$H_n(\omega) = -\frac{1}{2n} \sum_{j,k=1}^n \delta(\omega_j, \omega_k)$.

As in the case of the mean-field Blume-Emery-Griffiths model, the space of macrostates for the
Curie-Weiss-Potts model is the set of probability measures on $\Lambda$, the macroscopic variables are the empirical
measures associated with $\omega$, and the associated LDP is Sanov's Theorem, for which the rate function is the
relative entropy with respect to $\rho$. The large deviation analysis of the model is summarized in [11], which
together with [12] gives a complete analysis of ensemble equivalence and nonequivalence at the level of
equilibrium macrostates.

3. Short-range spin systems. Short-range spin systems such as the Ising model on $\mathbb{Z}^d$ and numerous
generalizations can also be handled by the methods of this paper. The large deviation techniques required
to analyze these models are much more subtle than in the case of the long-range, mean-field models
considered in items 1 and 2. The already complicated large deviation analysis of one-dimensional models is
given in Section IV.7 of [?]. The even more sophisticated analysis of multi-dimensional models is
carried out in [25, 44]. For these spin systems the space of macrostates is the space of translation-invariant
probability measures on $\mathbb{Z}^d$, the macroscopic variables are the empirical processes associated with the spin
configurations, and the rate function in the associated LDP is the mean relative entropy.
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4. A model of two-dimensional turbulence. The Miller-Robert model is a model of coherent structures 
in an ideal, two-dimensional fluid that includes all the exact invariants of the vorticity transport equation 



^|45|]. In its original formulation, the infinite family of enstrophy integrals is imposed microcanonically 



along with the energy. If this formulation is slightly relaxed to include only finitely many enstrophy integrals, 
then the model can be put in the general form described above; that form can also be naturally extended to 
encompass complete enstrophy conservation. The space of macrostates is the space of Young measures on 
the vorticity field; that is, a macrostate has the form ^{x, dz), where x € A runs over the fluid domain A, z 
runs over the range of the vorticity field C,{x), and for almost all x, n{x, dz) is a probability measure in z. 
The large deviation analysis of this model developed first in lE^ and more recently in gives a rigorous 
derivation of maximum entropy principles governing the equilibrium behavior of the ideal fluid. 



5. A model of quasi-geostrophic turbulence. In later formulations, especially in geophysical applications, another version of the model in item 4 is preferred, in which the enstrophy integrals are treated canonically and the energy and circulation are treated microcanonically [18]. In those formulations, the space of macrostates is $L^2(\Lambda)$ or $L^\infty(\Lambda)$ depending on the constraints on the vorticity field. The large deviation analysis for such a formulation is carried out in [18]. Numerical results given in [20] illustrate key examples of nonequivalence with respect to the energy and circulation invariants. In addition, [20] shows how the nonlinear stability of the steady mean flows arising as equilibrium macrostates in these models can be established by utilizing the appropriate generalized thermodynamic potentials.

6. A model of dispersive wave turbulence. A statistical equilibrium model of solitary wave structures in dispersive wave turbulence governed by a nonlinear Schrödinger equation is studied in [21]. In this model the energy is treated canonically while the particle number invariant is imposed microcanonically; without the microcanonical constraint on particle number the ensemble is not normalizable for focusing nonlinearities. The large deviation analysis given in [21] derives rigorously the concentration phenomenon
observed in long-time numerical simulations and predicted by mean-field approximations 1041 liill . The 
space of macrostates is i^(A), where A is a bounded interval or more generally a bounded domain in M"^. 



We now return to the general theory, first introducing the function whose support and concavity properties completely determine all aspects of ensemble equivalence and nonequivalence. This function is the microcanonical entropy, defined for $u \in \mathbb{R}^\sigma$ by

$$s(u) = -\inf\{I(x) : x \in \mathcal{X},\ H(x) = u\}. \tag{2.2}$$

Since $I$ maps $\mathcal{X}$ into $[0, \infty]$, $s$ maps $\mathbb{R}^\sigma$ into $[-\infty, 0]$. Moreover, since $I$ is lower semicontinuous and $H$ is continuous on $\mathcal{X}$, $s$ is upper semicontinuous on $\mathbb{R}^\sigma$. We define dom $s$ to be the set of $u \in \mathbb{R}^\sigma$ for which $s(u) > -\infty$. In general, dom $s$ is nonempty since $-s$ is a rate function [19, Prop. 3.1(a)]. For each $u \in$ dom $s$, $r > 0$, $n \in \mathbb{N}$, and set $B \in \mathcal{F}_n$ the microcanonical ensemble is defined to be the conditioned measure

$$P_n^{u,r}\{B\} = P_n\{B \mid h_n \in \{u\}^{(r)}\}, \tag{2.3}$$

where $\{u\}^{(r)} = [u_1 - r, u_1 + r] \times \cdots \times [u_\sigma - r, u_\sigma + r]$. As shown in [19, p. 1027], if $u \in$ dom $s$, then for all sufficiently large $n$, $P_n\{h_n \in \{u\}^{(r)}\} > 0$; thus the conditioned measures $P_n^{u,r}$ are well defined.

A mathematically more tractable probability measure is the canonical ensemble. Let $\langle \cdot, \cdot \rangle$ denote the Euclidean inner product on $\mathbb{R}^\sigma$. For each $n \in \mathbb{N}$, $\beta \in \mathbb{R}^\sigma$, and set $B \in \mathcal{F}_n$ we define the partition function

$$Z_n(\beta) = \int_{\Omega_n} \exp[-a_n \langle \beta, h_n \rangle] \, dP_n,$$

which is well defined and finite, and the probability measure

$$P_{n,\beta}\{B\} = \frac{1}{Z_n(\beta)} \int_B \exp[-a_n \langle \beta, h_n \rangle] \, dP_n. \tag{2.4}$$

The measures $P_{n,\beta}$ are Gibbs states that define the canonical ensemble for the given model.

The generalized canonical ensemble is a natural perturbation of the canonical ensemble, defined in terms of a continuous function $g$ mapping $\mathbb{R}^\sigma$ into $\mathbb{R}$. For each $n \in \mathbb{N}$ and $\beta \in \mathbb{R}^\sigma$ we define the generalized partition function

$$Z_{n,g}(\beta) = \int_{\Omega_n} \exp[-a_n \langle \beta, h_n \rangle - a_n g(h_n)] \, dP_n. \tag{2.5}$$

This is well defined and finite because the $h_n$ are bounded and $g$ is bounded on the range of the $h_n$. For $B \in \mathcal{F}_n$ we also define the probability measure

$$P_{n,\beta,g}\{B\} = \frac{1}{Z_{n,g}(\beta)} \int_B \exp[-a_n \langle \beta, h_n \rangle - a_n g(h_n)] \, dP_n, \tag{2.6}$$

which we call the generalized canonical ensemble. The special case in which g equals a quadratic function 
gives rise to the Gaussian ensemble I 

8, 9,1311,1321,13115^- 
In order to define the sets of equilibrium macrostates for each ensemble, we summarize two large deviation results proved in [19] and extend one of them. It is proved in [19, Thm. 3.2] that with respect to the microcanonical ensemble $P_n^{u,r}$, $Y_n$ satisfies the LDP on $\mathcal{X}$, in the double limit $n \to \infty$ and $r \to 0$, with rate function

$$I^u(x) = \begin{cases} I(x) + s(u) & \text{if } H(x) = u, \\ \infty & \text{otherwise.} \end{cases} \tag{2.7}$$

$I^u$ is nonnegative on $\mathcal{X}$, and for $u \in$ dom $s$, $I^u$ attains its infimum of 0 on the set

$$\mathcal{E}^u = \{x \in \mathcal{X} : I^u(x) = 0\} = \{x \in \mathcal{X} : I(x) \text{ is minimized subject to } H(x) = u\}. \tag{2.8}$$

In order to state the LDPs for the other two ensembles, we bring in the canonical free energy, defined for $\beta \in \mathbb{R}^\sigma$ by

$$\varphi(\beta) = -\lim_{n \to \infty} \frac{1}{a_n} \log Z_n(\beta),$$

and the generalized canonical free energy, defined by

$$\varphi_g(\beta) = -\lim_{n \to \infty} \frac{1}{a_n} \log Z_{n,g}(\beta).$$

Clearly $\varphi_0(\beta) = \varphi(\beta)$. It is proved in [19, Thm. 2.4] that the limit defining $\varphi(\beta)$ exists and is given by

$$\varphi(\beta) = \inf_{y \in \mathcal{X}} \{I(y) + \langle \beta, H(y) \rangle\} \tag{2.9}$$

and that with respect to $P_{n,\beta}$, $Y_n$ satisfies the LDP on $\mathcal{X}$ with rate function

$$I_\beta(x) = I(x) + \langle \beta, H(x) \rangle - \varphi(\beta). \tag{2.10}$$

$I_\beta$ is nonnegative on $\mathcal{X}$ and attains its infimum of 0 on the set

$$\mathcal{E}_\beta = \{x \in \mathcal{X} : I_\beta(x) = 0\} = \{x \in \mathcal{X} : I(x) + \langle \beta, H(x) \rangle \text{ is minimized}\}. \tag{2.11}$$

A straightforward extension of these results shows that the limit defining $\varphi_g(\beta)$ exists and is given by

$$\varphi_g(\beta) = \inf_{y \in \mathcal{X}} \{I(y) + \langle \beta, H(y) \rangle + g(H(y))\} \tag{2.12}$$

and that with respect to $P_{n,\beta,g}$, $Y_n$ satisfies the LDP on $\mathcal{X}$ with rate function

$$I_{\beta,g}(x) = I(x) + \langle \beta, H(x) \rangle + g(H(x)) - \varphi_g(\beta). \tag{2.13}$$

$I_{\beta,g}$ is nonnegative on $\mathcal{X}$ and attains its infimum of 0 on the set

$$\mathcal{E}(g)_\beta = \{x \in \mathcal{X} : I_{\beta,g}(x) = 0\} = \{x \in \mathcal{X} : I(x) + \langle \beta, H(x) \rangle + g(H(x)) \text{ is minimized}\}. \tag{2.14}$$

For $u \in$ dom $s$, let $x$ be any element of $\mathcal{X}$ satisfying $I^u(x) > 0$. The formal notation

$$P_n^{u,r}\{Y_n \in dx\} \asymp e^{-a_n I^u(x)}$$

suggests that $x$ has an exponentially small probability of being observed in the limit $n \to \infty$, $r \to 0$. Hence it makes sense to identify $\mathcal{E}^u$ with the set of microcanonical equilibrium macrostates. In the same way we identify $\mathcal{E}_\beta$ with the set of canonical equilibrium macrostates and $\mathcal{E}(g)_\beta$ with the set of generalized canonical equilibrium macrostates. A rigorous justification is given in [19, Thm. 2.4(d)].



III. ENSEMBLE EQUIVALENCE AT THE LEVEL OF EQUILIBRIUM MACROSTATES 

Having defined the sets of equilibrium macrostates $\mathcal{E}^u$, $\mathcal{E}_\beta$, and $\mathcal{E}(g)_\beta$ for the microcanonical, canonical, and generalized canonical ensembles, we now come to the main point of this paper, which is to show how these sets relate to one another. In Theorem 3.1 we state the results proved in [19] concerning equivalence and nonequivalence at the level of equilibrium macrostates for the microcanonical and canonical ensembles. Then in Theorem 3.4 we extend these results to the generalized canonical ensemble.

Parts (a)-(c) of Theorem 3.1 give necessary and sufficient conditions, in terms of support properties of $s$, for equivalence and nonequivalence of $\mathcal{E}^u$ and $\mathcal{E}_\beta$. These assertions are proved in Theorems 4.4 and 4.8 in [19]. Part (a) states that $s$ has a strictly supporting hyperplane at $u$ if and only if full equivalence of ensembles holds; i.e., if and only if there exists a $\beta$ such that $\mathcal{E}^u = \mathcal{E}_\beta$. The most surprising result, given in part (c), is that $s$ has no supporting hyperplane at $u$ if and only if nonequivalence of ensembles holds in the strong sense that $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \in \mathbb{R}^\sigma$. Part (c) is to be contrasted with part (d), which states that for any $\beta \in \mathbb{R}^\sigma$ canonical equilibrium macrostates can always be realized microcanonically. Part (d) is proved in Theorem 4.6 in [19]. Thus one conclusion of this theorem is that at the level of equilibrium macrostates the microcanonical ensemble is the richer of the two ensembles. The concept of a relative boundary point, which arises in part (c), is defined after the statement of the theorem. For $\beta \in \mathbb{R}^\sigma$, $[\beta, -1]$ denotes the vector in $\mathbb{R}^{\sigma+1}$ whose first $\sigma$ components agree with those of $\beta$ and whose last component equals $-1$.

Theorem 3.1. In parts (a), (b), and (c), $u$ denotes any point in dom $s$.

(a) Full equivalence. There exists $\beta \in \mathbb{R}^\sigma$ such that $\mathcal{E}^u = \mathcal{E}_\beta$ if and only if $s$ has a strictly supporting hyperplane at $u$ with normal vector $[\beta, -1]$; i.e.,

$$s(v) < s(u) + \langle \beta, v - u \rangle \quad \text{for all } v \neq u.$$

(b) Partial equivalence. There exists $\beta \in \mathbb{R}^\sigma$ such that $\mathcal{E}^u \subset \mathcal{E}_\beta$ but $\mathcal{E}^u \neq \mathcal{E}_\beta$ if and only if $s$ has a nonstrictly supporting hyperplane at $u$ with normal vector $[\beta, -1]$; i.e.,

$$s(v) \leq s(u) + \langle \beta, v - u \rangle \quad \text{for all } v, \text{ with equality for some } v \neq u.$$

(c) Nonequivalence. For all $\beta \in \mathbb{R}^\sigma$, $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ if and only if $s$ has no supporting hyperplane at $u$; i.e.,

$$\text{for all } \beta \in \mathbb{R}^\sigma \text{ there exists } v \text{ such that } s(v) > s(u) + \langle \beta, v - u \rangle.$$

Except possibly for relative boundary points of dom $s$, the latter condition is equivalent to the nonconcavity of $s$ at $u$ [Thm. A.5(c)].

(d) Canonical is always realized microcanonically. For any $\beta \in \mathbb{R}^\sigma$ we have $H(\mathcal{E}_\beta) \subset$ dom $s$ and

$$\mathcal{E}_\beta = \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u.$$

We highlight several features of the theorem in order to illuminate their physical content. In part (a) we assume that for a given $u \in$ dom $s$ there exists a unique $\beta$ such that $\mathcal{E}^u = \mathcal{E}_\beta$. If $s$ is differentiable at $u$ and $s$ and the double Legendre-Fenchel transform $s^{**}$ are equal in a neighborhood of $u$, then $\beta$ is given by the standard thermodynamic formula $\beta = \nabla s(u)$ [Thm. A.4(b)]. The inverse relationship can be obtained from part (d) of the theorem under the assumption that $\mathcal{E}_\beta$ consists of a unique macrostate or more generally that for all $x \in \mathcal{E}_\beta$ the values $H(x)$ are equal. Then $\mathcal{E}_\beta = \mathcal{E}^{u(\beta)}$, where $u(\beta) = H(x)$ for any $x \in \mathcal{E}_\beta$; $u(\beta)$ denotes the mean energy realized at equilibrium in the canonical ensemble. The relationship $u = u(\beta)$ inverts the relationship $\beta = \nabla s(u)$. Partial ensemble equivalence can be seen in part (d) under the assumption that for a given $\beta$, $\mathcal{E}_\beta$ can be partitioned into at least two sets $\mathcal{E}_{\beta,i}$ such that for all $x \in \mathcal{E}_{\beta,i}$ the values $H(x)$ are equal but $H(x) \neq H(y)$ whenever $x \in \mathcal{E}_{\beta,i}$ and $y \in \mathcal{E}_{\beta,j}$ for $i \neq j$. Then $\mathcal{E}_\beta = \bigcup_i \mathcal{E}^{u_i(\beta)}$, where $u_i(\beta) = H(x)$, $x \in \mathcal{E}_{\beta,i}$. Clearly, for each $i$, $\mathcal{E}^{u_i(\beta)} \subset \mathcal{E}_\beta$ but $\mathcal{E}^{u_i(\beta)} \neq \mathcal{E}_\beta$. Physically, this corresponds to a situation of coexisting phases that normally takes place at a first-order phase transition [52].
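The set relations in Theorem 3.1 can be checked by brute force on a finite toy model. The following is a minimal sketch, assuming a hypothetical five-point macrostate space with made-up values of $I$ and $H$ (none of these numbers come from the paper); exact rational arithmetic is used so that ties between minimizers are detected reliably. In this toy model $s(0) = 0$, $s(1) = -3$, $s(2) = -2$, so $s$ has no supporting line at $u = 1$.

```python
from fractions import Fraction

# Hypothetical toy data (not from the paper): macrostate -> (I(x), H(x)).
MACROSTATES = {"a": (0, 0), "b": (3, 1), "c": (3, 1), "d": (2, 2), "e": (5, 2)}


def entropy(u):
    """s(u) = -inf{I(x) : H(x) = u}; None stands for -infinity (u outside dom s)."""
    rates = [I for (I, H) in MACROSTATES.values() if H == u]
    return -min(rates) if rates else None


def micro_set(u):
    """E^u: minimizers of I(x) subject to H(x) = u."""
    s_u = entropy(u)
    if s_u is None:
        return set()
    return {x for x, (I, H) in MACROSTATES.items() if H == u and I == -s_u}


def canon_set(beta):
    """E_beta: minimizers of I(x) + beta * H(x) over all macrostates."""
    cost = {x: I + beta * H for x, (I, H) in MACROSTATES.items()}
    best = min(cost.values())
    return {x for x, c in cost.items() if c == best}


def realized_microcanonically(beta):
    """Check part (d): E_beta equals the union of E^u over u in H(E_beta)."""
    e_beta = canon_set(beta)
    energies = {MACROSTATES[x][1] for x in e_beta}
    union = set().union(*(micro_set(u) for u in energies))
    return e_beta == union


# Grid of inverse temperatures beta in [-3, 3] with step 1/4.
BETAS = [Fraction(k, 4) for k in range(-12, 13)]
```

On this data, `canon_set(Fraction(-1))` contains two macrostates with different energies (the partial-equivalence situation described above), while the macrostates in `micro_set(1)` never appear in any `canon_set(beta)`, illustrating part (c).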

Theorem 4.10 in [19] states an alternative version of part (d) of Theorem 3.1 in which the set $H(\mathcal{E}_\beta)$ of canonical equilibrium mean-energy values is replaced by another set. We next present a third version of part (d) that could be useful in applications. This corollary is also aesthetically pleasing because like parts (a)-(c) of Theorem 3.1 it is formulated in terms of support properties of $s$.

Corollary 3.2. For $\beta \in \mathbb{R}^\sigma$ we define $A_\beta$ to be the set of $u \in$ dom $s$ such that $s$ has a supporting hyperplane at $u$ with normal vector $[\beta, -1]$. Then

$$\mathcal{E}_\beta = \bigcup_{u \in A_\beta} \mathcal{E}^u.$$

Proof. Part (d) of Theorem 3.1 implies that if $u \in H(\mathcal{E}_\beta)$, then $\mathcal{E}^u \subset \mathcal{E}_\beta$. From parts (a) and (b) of the theorem it follows that $s$ has a supporting hyperplane at $u$ with normal vector $[\beta, -1]$. Hence $H(\mathcal{E}_\beta) \subset A_\beta$ and

$$\mathcal{E}_\beta = \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u \subset \bigcup_{u \in A_\beta} \mathcal{E}^u.$$

The reverse inclusion is also a consequence of parts (a) and (b) of the theorem, which imply that if $u \in A_\beta$, then $\mathcal{E}^u \subset \mathcal{E}_\beta$ and thus that

$$\bigcup_{u \in A_\beta} \mathcal{E}^u \subset \mathcal{E}_\beta.$$

This completes the proof. ■

Before continuing with our analysis of ensemble equivalence, we introduce several sets that play a central role in the theory. Let $f \not\equiv -\infty$ be a function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$. The relative interior of dom $f$, denoted by ri(dom $f$), is defined as the interior of dom $f$ when considered as a subset of the smallest affine set that contains dom $f$. Clearly, if the smallest affine set that contains dom $f$ is $\mathbb{R}^\sigma$, then the relative interior of dom $f$ equals the interior of dom $f$, which we denote by int(dom $f$). This is the case if, for example, $\sigma = 1$ and dom $f$ is a nonempty interval. The relative boundary of dom $f$ is defined as cl(dom $f$) $\setminus$ ri(dom $f$).

We continue by giving several definitions for concave functions on $\mathbb{R}^\sigma$ when $\sigma$ is an arbitrary positive integer. We then specialize to the case $\sigma = 1$, for which all the concepts can be easily visualized. Additional material on concave functions is contained in the appendix. Let $f$ be a concave function on $\mathbb{R}^\sigma$. For $u \in \mathbb{R}^\sigma$ the superdifferential of $f$ at $u$, denoted by $\partial f(u)$, is defined to be the set of $\beta \in \mathbb{R}^\sigma$ such that $[\beta, -1]$ is the normal vector to a supporting hyperplane of $f$ at $u$; i.e.,

$$f(v) \leq f(u) + \langle \beta, v - u \rangle \quad \text{for all } v \in \mathbb{R}^\sigma.$$

Any such $\beta$ is called a supergradient of $f$ at $u$. The domain of $\partial f$, denoted by dom $\partial f$, is then defined to be the set of $u$ for which $\partial f(u) \neq \emptyset$. A basic fact is that dom $\partial f$ is a subset of dom $f$ and differs from it, if at all, only in a subset of the relative boundary of dom $f$; a precise statement is given in part (a) of Theorem A.1. By definition of dom $\partial f$, it follows that $f$ has a supporting hyperplane at all points of dom $f$ except possibly relative boundary points.

We now specialize to the case $\sigma = 1$, considering a concave function $f$ mapping $\mathbb{R}$ into $\mathbb{R} \cup \{-\infty\}$ for which dom $f$ is a nonempty interval $L$. For $u \in L$, $\partial f(u)$ is defined to be the set of $\beta \in \mathbb{R}$ such that $\beta$ is the slope of a supporting line of $f$ at $u$. Thus, if $f$ is differentiable at $u \in$ int $L$, then $\partial f(u)$ consists of the unique point $\beta = f'(u)$. If $f$ is not differentiable at $u \in$ int $L$, then $\partial f(u)$ consists of all $\beta$ satisfying the inequalities

$$(f')^+(u) \leq \beta \leq (f')^-(u),$$

where $(f')^-(u)$ and $(f')^+(u)$ denote the left-hand and right-hand derivatives of $f$ at $u$.

Complications arise because dom $\partial f$ can be a proper subset of dom $f$, as the situation in one dimension clearly shows. Let $b$ be a boundary point of dom $f$ for which $f(b) > -\infty$. Then $b$ is in dom $\partial f$ if and only if the one-sided derivative of $f$ at $b$ is finite. For example, if $b$ is a left-hand boundary point of dom $f$ and $(f')^+(b)$ is finite, then $\partial f(b) = [(f')^+(b), \infty)$; any $\beta \in \partial f(b)$ is the slope of a supporting line at $b$. The possible discrepancy between dom $\partial f$ and dom $f$ introduces unavoidable technicalities in the statements of many results concerning the existence of supporting hyperplanes.

One of our goals is to find concavity and support conditions on the microcanonical entropy guaranteeing that the microcanonical and canonical ensembles are fully equivalent at all points $u \in$ dom $s$ except possibly relative boundary points. If this is the case, then we say that the ensembles are universally equivalent. Here is a basic result in that direction.

Theorem 3.3. Assume that dom $s$ is a convex subset of $\mathbb{R}^\sigma$ and that $s$ is strictly concave on ri(dom $s$) and continuous on dom $s$. The following conclusions hold.

(a) $s$ has a strictly supporting hyperplane at all $u \in$ dom $s$ except possibly relative boundary points.

(b) The microcanonical and canonical ensembles are universally equivalent; i.e., fully equivalent at all $u \in$ dom $s$ except possibly relative boundary points.

(c) $s$ is concave on $\mathbb{R}^\sigma$, and for each $u$ in part (b) the corresponding $\beta$ in the statement of full equivalence is any element of $\partial s(u)$.

(d) If $s$ is differentiable at some $u \in$ dom $s$, then the corresponding $\beta$ in part (b) is unique and is given by the standard thermodynamic formula $\beta = \nabla s(u)$.

Proof. (a) This is a consequence of part (c) of Theorem A.4.

(b) The universal equivalence follows from part (a) of the present theorem and part (a) of Theorem 3.1.

(c) By Proposition A.3 the continuity of $s$ on dom $s$ allows us to extend the strict concavity of $s$ on ri(dom $s$) to the concavity of $s$ on dom $s$. Since $s$ equals $-\infty$ on the complement of dom $s$, $s$ is also concave on $\mathbb{R}^\sigma$. The second assertion in part (c) is the definition of supergradient.

(d) This is a consequence of part (c) of the present theorem and part (b) of Theorem A.1. ■

We now come to the main result of this paper, which extends Theorem 3.1 by giving equivalence and nonequivalence results involving $\mathcal{E}^u$ and $\mathcal{E}(g)_\beta$. The proof of the theorem makes it transparent why $s$ in Theorem 3.1 is replaced here by $s - g$. In [10] an independent proof of Theorem 3.4 is derived from first principles rather than from Theorem 3.1. As we point out after the statement of Theorem 3.4, for the purpose of applications part (a) is its most important contribution. In order to illuminate its physical content, we note that if $s - g$ is differentiable at some $u \in$ dom $s$ and $s - g = (s - g)^{**}$ in a neighborhood of $u$, then $\beta$ is unique and is given by the thermodynamic formula $\beta = \nabla (s - g)(u)$ [Thm. A.4(b)].

Theorem 3.4. Let $g$ be a continuous function mapping $\mathbb{R}^\sigma$ into $\mathbb{R}$, in terms of which the generalized canonical ensemble (2.6) is defined. The following conclusions hold. In parts (a), (b), and (c), $u$ denotes any point in dom $s$.

(a) Full equivalence. There exists $\beta \in \mathbb{R}^\sigma$ such that $\mathcal{E}^u = \mathcal{E}(g)_\beta$ if and only if $s - g$ has a strictly supporting hyperplane at $u$ with normal vector $[\beta, -1]$.

(b) Partial equivalence. There exists $\beta \in \mathbb{R}^\sigma$ such that $\mathcal{E}^u \subset \mathcal{E}(g)_\beta$ but $\mathcal{E}^u \neq \mathcal{E}(g)_\beta$ if and only if $s - g$ has a nonstrictly supporting hyperplane at $u$ with normal vector $[\beta, -1]$.

(c) Nonequivalence. For all $\beta \in \mathbb{R}^\sigma$, $\mathcal{E}^u \cap \mathcal{E}(g)_\beta = \emptyset$ if and only if $s - g$ has no supporting hyperplane at $u$. Except possibly for relative boundary points of dom $s$, the latter condition is equivalent to the nonconcavity of $s - g$ at $u$ [Thm. A.5(c)].

(d) Generalized canonical is always realized microcanonically. For any $\beta \in \mathbb{R}^\sigma$ we have $H(\mathcal{E}(g)_\beta) \subset$ dom $s$ and

$$\mathcal{E}(g)_\beta = \bigcup_{u \in H(\mathcal{E}(g)_\beta)} \mathcal{E}^u.$$

Proof. For $B \in \mathcal{F}_n$ we define a new probability measure

$$P_{n,g}\{B\} = \frac{1}{\int_{\Omega_n} \exp[-a_n g(h_n)] \, dP_n} \int_B \exp[-a_n g(h_n)] \, dP_n.$$

Replacing the prior measure $P_n$ in the standard canonical ensemble with $P_{n,g}$ gives the generalized canonical ensemble $P_{n,\beta,g}$; i.e.,

$$P_{n,\beta,g}\{B\} = \frac{1}{\int_{\Omega_n} \exp[-a_n \langle \beta, h_n \rangle] \, dP_{n,g}} \int_B \exp[-a_n \langle \beta, h_n \rangle] \, dP_{n,g}.$$

We also introduce a new conditioned measure

$$P_{n,g}^{u,r}\{B\} = P_{n,g}\{B \mid h_n \in \{u\}^{(r)}\},$$

obtained from the microcanonical ensemble $P_n^{u,r}$ by replacing $P_n$ with $P_{n,g}$. Since $g$ is continuous, for $\omega$ in the set $\{h_n \in \{u\}^{(r)}\}$, $g(h_n(\omega))$ converges to $g(u)$ uniformly in $\omega$ and $n$ as $r \to 0$. It follows that with respect to $P_{n,g}^{u,r}$, $Y_n$ satisfies the LDP on $\mathcal{X}$, in the double limit $n \to \infty$ and $r \to 0$, with the same rate function $I^u$ as in the LDP for $Y_n$ with respect to $P_n^{u,r}$. As a result, the set $\mathcal{E}(g)^u$ of equilibrium macrostates corresponding to $P_{n,g}^{u,r}$ coincides with the set $\mathcal{E}^u$ of microcanonical equilibrium macrostates.

At this point we recall that according to Theorem 3.1 all equivalence and nonequivalence relationships between $\mathcal{E}^u$ and $\mathcal{E}_\beta$ are expressed in terms of support properties of

$$s(u) = -\inf\{I(x) : x \in \mathcal{X},\ H(x) = u\},$$

where $I$ is the rate function in the LDP for $Y_n$ with respect to the prior measures $P_n$. With respect to the new prior measures $P_{n,g}$, $Y_n$ satisfies the LDP on $\mathcal{X}$ with rate function

$$I_g(x) = I(x) + g(H(x)) - \text{const}.$$

It follows that all equivalence and nonequivalence relationships between $\mathcal{E}(g)^u$ and $\mathcal{E}(g)_\beta$ are expressed in terms of support properties of the function $s_g$ obtained from $s$ by replacing the rate function $I$ by the new rate function $I_g$. The function $s_g$ is given by

$$\begin{aligned} s_g(u) &= -\inf\{I_g(x) : x \in \mathcal{X},\ H(x) = u\} \\ &= -\inf\{I(x) + g(H(x)) : x \in \mathcal{X},\ H(x) = u\} + \text{const} \\ &= s(u) - g(u) + \text{const}. \end{aligned}$$

Since $\mathcal{E}(g)^u = \mathcal{E}^u$ and since $s_g$ differs from $s - g$ by a constant, we conclude that all equivalence and nonequivalence relationships between $\mathcal{E}^u$ and $\mathcal{E}(g)_\beta$ are expressed in terms of the same support properties of $s - g$. This completes the derivation of Theorem 3.4 from Theorem 3.1. ■

The relationships between $\mathcal{E}^u$ and $\mathcal{E}(g)_\beta$ in Theorem 3.4 are valid under much weaker assumptions on both $g$ and $H$ that guarantee that these sets are nonempty. For example, the continuity of $g$ is not needed. Of course, if one does not have the LDPs for $Y_n$ with respect to $P_n^{u,r}$ and $P_{n,\beta,g}$, then one cannot interpret $\mathcal{E}^u$ and $\mathcal{E}(g)_\beta$ as sets of equilibrium macrostates for the two ensembles. A similar comment applies to Theorem 3.1.

The next corollary gives an alternative version of part (d) of Theorem 3.4. It follows from the theorem in the same way that Corollary 3.2 follows from Theorem 3.1, which is the analogue of Theorem 3.4 for the canonical ensemble.

Corollary 3.5. Let $g$ be a continuous function mapping $\mathbb{R}^\sigma$ into $\mathbb{R}$, in terms of which the generalized canonical ensemble (2.6) is defined. For $\beta \in \mathbb{R}^\sigma$ we define $A(g)_\beta$ to be the set of $u \in$ dom $s$ such that $s - g$ has a supporting hyperplane at $u$ with normal vector $[\beta, -1]$. Then

$$\mathcal{E}(g)_\beta = \bigcup_{u \in A(g)_\beta} \mathcal{E}^u.$$

The importance of part (a) of Theorem 3.4 in applications is emphasized by the following theorem, which will be applied several times in the sequel. This theorem is the analogue of Theorem 3.3 for the generalized canonical ensemble, replacing $s$ in that theorem with $s - g$. Since $g$ takes values in $\mathbb{R}$, the domain of $s - g$ equals the domain of $s$. Theorem 3.6 is proved exactly like Theorem 3.3.

Theorem 3.6. Assume that dom $s$ is a convex subset of $\mathbb{R}^\sigma$ and that $s - g$ is strictly concave on ri(dom $s$) and continuous on dom $s$. The following conclusions hold.

(a) $s - g$ has a strictly supporting hyperplane at all $u \in$ dom $s$ except possibly relative boundary points.

(b) The microcanonical and generalized canonical ensembles are universally equivalent; i.e., fully equivalent at all $u \in$ dom $s$ except possibly relative boundary points.

(c) $s - g$ is concave on $\mathbb{R}^\sigma$, and for each $u$ in part (b) the corresponding $\beta$ in the statement of full equivalence is any element of $\partial (s - g)(u)$.

(d) If $s - g$ is differentiable at some $u \in$ dom $s$, then the corresponding $\beta$ in part (b) is unique and is given by the thermodynamic formula $\beta = \nabla (s - g)(u)$.

The most important repercussion of Theorem 3.6 is the ease with which one can prove that the microcanonical and generalized canonical ensembles are universally equivalent in those cases in which the microcanonical and standard canonical ensembles are not fully or partially equivalent. In order to achieve universal equivalence, one merely chooses $g$ so that $s - g$ is strictly concave on ri(dom $s$). One has considerable freedom in doing this since the only requirement is that $g$ be continuous. Section V is devoted to this and related issues. In Theorems 5.2-5.5 we will give several useful examples, three of which involve quadratic functions $g$.
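The effect of choosing $g$ so that $s - g$ becomes strictly concave can be seen on a finite toy model. The sketch below is illustrative only: the macrostates and their values of $I$ and $H$ are hypothetical, and the choice $g(u) = \gamma u^2$ with $\gamma = 3$ is an assumption made for this example. Here $s(0) = 0$, $s(1) = -3$, $s(2) = -2$, so $s$ has no supporting line at $u = 1$, whereas $s - g$ takes the values $0, -6, -14$ and does.

```python
from fractions import Fraction

# Hypothetical toy data (not from the paper): macrostate -> (I(x), H(x)).
MACROSTATES = {"a": (0, 0), "b": (3, 1), "c": (3, 1), "d": (2, 2), "e": (5, 2)}


def micro_set(u):
    """E^u: minimizers of I(x) subject to H(x) = u (u must be an attained energy)."""
    rates = {x: I for x, (I, H) in MACROSTATES.items() if H == u}
    best = min(rates.values())
    return {x for x, I in rates.items() if I == best}


def gen_canon_set(beta, gamma):
    """E(g)_beta for the assumed g(u) = gamma * u**2:
    minimizers of I(x) + beta * H(x) + g(H(x))."""
    cost = {x: I + beta * H + gamma * H * H for x, (I, H) in MACROSTATES.items()}
    best = min(cost.values())
    return {x for x, c in cost.items() if c == best}


# Grid of beta values in [-12, 12] with step 1/4 (exact rationals avoid tie errors).
BETAS = [Fraction(k, 4) for k in range(-48, 49)]


def fully_equivalent_betas(u, gamma):
    """All grid values of beta for which E(g)_beta coincides with E^u."""
    return [b for b in BETAS if gen_canon_set(b, gamma) == micro_set(u)]
```

With `gamma = 0` (the standard canonical ensemble) `fully_equivalent_betas(1, 0)` is empty, while with `gamma = 3` every energy value in this toy model has at least one `beta` giving full equivalence.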

In the next section we introduce the thermodynamic level of ensemble equivalence and discuss its relationship to ensemble equivalence at the level of equilibrium macrostates.

IV. ENSEMBLE EQUIVALENCE AT THE THERMODYNAMIC LEVEL 

The thermodynamic level of ensemble equivalence is formulated in terms of the Legendre-Fenchel transform for concave, upper semicontinuous functions. Such transforms arise in a natural way via the variational formula (2.9) for the canonical free energy $\varphi$. Replacing the infimum over $y \in \mathcal{X}$ by the infimum over $y \in \mathcal{X}$ satisfying $H(y) = u$ followed by the infimum over $u \in \mathbb{R}^\sigma$ and using the definition (2.2) of the microcanonical entropy $s$, we see that for all $\beta \in \mathbb{R}^\sigma$

$$\begin{aligned} \varphi(\beta) &= \inf_{u \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle + \inf\{I(y) : y \in \mathcal{X},\ H(y) = u\}\} \\ &= \inf_{u \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - s(u)\} = s^*(\beta). \end{aligned}$$

This calculation shows that $\varphi$, the basic thermodynamic function in the canonical ensemble, can always be expressed as the Legendre-Fenchel transform $s^*$ of $s$, the basic thermodynamic function in the microcanonical ensemble. However, the converse need not be true. In fact, by the theory of Legendre-Fenchel transforms $s(u) = \varphi^*(u)$ for all $u \in \mathbb{R}^\sigma$, or equivalently $s(u) = s^{**}(u)$ for all $u$, if and only if $s$ is concave and upper semicontinuous on $\mathbb{R}^\sigma$. While the upper semicontinuity is automatic from the definition of $s$, the concavity does not hold in general. This state of affairs concerning $\varphi$ and $s$ makes it clear that the thermodynamic level reveals what we have already seen at the level of equilibrium macrostates; namely, of the two ensembles the microcanonical ensemble is the more fundamental.
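The failure of $s = s^{**}$ for a nonconcave entropy can be seen numerically. The following is a rough sketch, assuming a hypothetical one-dimensional entropy $s(u) = -(u^2 - 1)^2$ on dom $s = [-2, 2]$ (chosen only for illustration) and replacing the infima by minima over finite grids, so the results are approximations that depend on the grid choices.

```python
def s(u):
    """Hypothetical nonconcave entropy on dom s = [-2, 2] (illustration only)."""
    return -(u * u - 1.0) ** 2


# Grids are assumptions of this sketch; the beta range covers all slopes of s on [-2, 2].
U_GRID = [k / 50 for k in range(-100, 101)]     # u in [-2, 2], step 0.02
BETA_GRID = [k / 10 for k in range(-300, 301)]  # beta in [-30, 30], step 0.1


def lf_transform(f_vals, xs, ys):
    """Concave Legendre-Fenchel transform on grids: f*(y) = min_x {x*y - f(x)}."""
    return [min(x * y - fx for x, fx in zip(xs, f_vals)) for y in ys]


s_vals = [s(u) for u in U_GRID]
s_star = lf_transform(s_vals, U_GRID, BETA_GRID)   # approximates phi(beta) = s*(beta)
s_dstar = lf_transform(s_star, BETA_GRID, U_GRID)  # approximates s**(u)

# gap[u] = s**(u) - s(u): zero where s agrees with s**, positive where it does not.
gap = {u: hull - val for u, val, hull in zip(U_GRID, s_vals, s_dstar)}
```

In this example the gap is positive on the nonconcave region around $u = 0$ and vanishes (up to rounding) at points such as $u = 1.5$, where $s$ has a supporting line.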

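Similar considerations apply to the relationship between $s$ and $\varphi_g$, the generalized canonical free energy, defined in terms of a continuous function $g$ mapping $\mathbb{R}^\sigma$ into $\mathbb{R}$. Making the same changes in the variational formula (2.12) for $\varphi_g$ as we just did in the variational formula for $\varphi$ shows that for all $\beta \in \mathbb{R}^\sigma$

$$\begin{aligned} \varphi_g(\beta) &= \inf_{u \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle + g(u) + \inf\{I(y) : y \in \mathcal{X},\ H(y) = u\}\} \\ &= \inf_{u \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle + g(u) - s(u)\} = (s - g)^*(\beta). \end{aligned}$$

As in the case when $g = 0$, this relationship can be inverted to give $(s - g)(u) = \varphi_g^*(u)$ for all $u \in \mathbb{R}^\sigma$, or equivalently $(s - g)(u) = (s - g)^{**}(u)$, if and only if $s - g$ is concave on $\mathbb{R}^\sigma$.

In order to be able to express these relationships in forms similar to those relating $\varphi$ and $s$, we define for $\beta$ and $u$ in $\mathbb{R}^\sigma$

$$s^\flat(g, \beta) = \inf_{u \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle + g(u) - s(u)\} = (s - g)^*(\beta) \tag{4.1}$$

and

$$s^\sharp(g, u) = g(u) + \inf_{\beta \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - s^\flat(g, \beta)\} = g(u) + (s - g)^{**}(u). \tag{4.2}$$

Thus for all $\beta$, $\varphi_g(\beta) = s^\flat(g, \beta)$ while for all $u$, $s^\sharp(g, u) = s(u)$ if and only if $(s - g)(u) = (s - g)^{**}(u) = \varphi_g^*(u)$, and this holds if and only if $s - g$ is concave on $\mathbb{R}^\sigma$.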

The next theorem records these facts in parts (a) and (b). Part (c) introduces a new theme proved in Theorem 26.3 in [51]. The strict concavity of $s - g$ on dom $s$ implies that $\varphi_g$ is essentially smooth; i.e., $\varphi_g$ is differentiable on $\mathbb{R}^\sigma$ and

$$\lim_{n \to \infty} \|\nabla \varphi_g(\beta_n)\| = \infty \quad \text{whenever } \|\beta_n\| \to \infty.$$

Setting $g = 0$ implies a similar result relating $s$ and $\varphi_0 = \varphi$. The differentiability of $\varphi$ or $\varphi_g$ implies that the corresponding ensemble does not exhibit a discontinuous, first-order phase transition.

Theorem 4.1. Let $g$ be a continuous function mapping $\mathbb{R}^\sigma$ into $\mathbb{R}$, in terms of which the generalized canonical ensemble (2.6) is defined. The choice $g = 0$ gives the standard canonical ensemble (2.4). The following conclusions hold.

(a) For all $\beta \in \mathbb{R}^\sigma$, $\varphi_g(\beta) = s^\flat(g, \beta) = (s - g)^*(\beta)$.

(b) For all $u \in \mathbb{R}^\sigma$,

$$s(u) = g(u) + (s - g)^{**}(u) = g(u) + \varphi_g^*(u)$$

if and only if $s - g$ is concave on $\mathbb{R}^\sigma$. Both of these are equivalent to $(s - g)(u) = (s - g)^{**}(u)$ and to $s(u) = s^\sharp(g, u)$.

(c) If dom $s$ is convex and $s - g$ is strictly concave on dom $s$, then $\varphi_g$ is essentially smooth; in particular, $\varphi_g$ is differentiable on $\mathbb{R}^\sigma$.

Theorem 4.1 is the basis for defining equivalence and nonequivalence of ensembles at the thermodynamic level. The microcanonical and canonical ensembles are said to be thermodynamically equivalent at $u \in$ dom $s$ if $s(u) = s^{**}(u)$ and to be thermodynamically nonequivalent at $u$ if $s(u) \neq s^{**}(u)$; the latter inequality holds if and only if $s(u) < s^{**}(u)$ [Prop. A.2]. Similarly, the microcanonical and generalized canonical ensembles are said to be thermodynamically equivalent at $u$ if $(s - g)(u) = (s - g)^{**}(u)$ (equivalently, $s(u) = s^\sharp(g, u)$) and to be thermodynamically nonequivalent at $u$ if $(s - g)(u) \neq (s - g)^{**}(u)$; the latter inequality holds if and only if $(s - g)(u) < (s - g)^{**}(u)$ [Prop. A.2].

The relationship between ensemble equivalence at the thermodynamic level and at the level of equilibrium macrostates is formulated in the next theorem for the microcanonical and generalized canonical ensembles. Setting $g = 0$ gives the corresponding relationships between ensemble equivalence at the two levels for the microcanonical and canonical ensembles. Ensemble equivalence at the thermodynamic level involves concavity properties of $s - g$ while ensemble equivalence at the level of equilibrium macrostates involves support properties of $s - g$. Except possibly for relative boundary points, $s - g$ is concave at $u \in$ dom $s$ if and only if $s - g$ has a supporting hyperplane at $u$. Hence if dom $s$ is open and so contains no relative boundary points, then the relationship between the two levels of ensemble equivalence is elegantly symmetric. This is given in part (a). In part (b) we state the less symmetric relationship between the two levels when dom $s$ is not open and so contains relative boundary points.

Theorem 4.2. Let $g$ be a continuous function mapping $\mathbb{R}^\sigma$ into $\mathbb{R}$, in terms of which the generalized canonical ensemble (2.6) is defined. The choice $g = 0$ gives the standard canonical ensemble. The following conclusions hold.

(a) Assume that dom $s$ is an open subset of $\mathbb{R}^\sigma$. Then the microcanonical and generalized canonical ensembles are thermodynamically equivalent at $u \in$ dom $s$ if and only if the ensembles are either fully or partially equivalent at $u$.

(b) Assume that dom $s$ is not an open subset of $\mathbb{R}^\sigma$. If the microcanonical and generalized canonical ensembles are thermodynamically equivalent at $u \in$ ri(dom $s$), then the ensembles are either fully or partially equivalent at $u$. Conversely, if the ensembles are either fully or partially equivalent at $u \in$ dom $s$, then the ensembles are thermodynamically equivalent at $u$.

Proof. (a) If dom $s$ is open, then since dom $s$ contains no relative boundary points, the sets dom $s$ and ri(dom $s$) coincide. Hence part (a) is a consequence of part (b).

(b) If the ensembles are thermodynamically equivalent at $u \in$ ri(dom $s$), then $(s - g)(u) = (s - g)^{**}(u)$. Applying the first inclusion in part (b) of Theorem A.5 to $f = s - g$, we conclude the existence of $\beta$ such that $s - g$ has a supporting hyperplane at $u$ with normal vector $[\beta, -1]$. Parts (a) and (b) of Theorem 3.4 then imply that the ensembles are either fully or partially equivalent at $u$. Conversely, if the ensembles are either fully or partially equivalent at $u \in$ dom $s$, then by parts (a) and (b) of Theorem 3.4 there exists $\beta$ such that $s - g$ has a supporting hyperplane at $u$ with normal vector $[\beta, -1]$. Applying part (a) of Theorem A.4 to $f = s - g$, we conclude that $(s - g)(u) = (s - g)^{**}(u)$; i.e., the ensembles are thermodynamically equivalent at $u$. This completes the proof. ■

In the next section we isolate a number of scenarios arising in applications for which the microcanonical and generalized canonical ensembles are universally equivalent. This rests mainly on part (b) of Theorem 3.6, which states that universal equivalence of ensembles holds if we can find a $g$ such that $s - g$ is strictly concave on ri(dom $s$).

V. UNIVERSAL EQUIVALENCE VIA THE GENERALIZED CANONICAL ENSEMBLE 

This section addresses a basic foundational issue in statistical mechanics. In Theorems 5.2-5.5 we show that when the standard canonical ensemble is nonequivalent to the microcanonical ensemble on a subset of values of $u$, it can often be replaced by a generalized canonical ensemble that is universally equivalent to the microcanonical ensemble. In three of these four theorems, the function $g$ defining the generalized canonical ensemble is a quadratic function, and the ensemble is Gaussian.

In these three theorems our strategy is to find a quadratic function $g$ such that $s - g$ is strictly concave on ri(dom $s$) and continuous on dom $s$. Part (b) of Theorem 3.6 then yields the universal equivalence. As the next proposition shows, an advantage of working with quadratic functions is that support properties of $s - g$ involving a supporting hyperplane are equivalent to support properties of $s$ involving a supporting paraboloid defined in terms of $g$. This observation gives a geometrically intuitive way to find a quadratic function $g$ guaranteeing universal ensemble equivalence.

In order to state the proposition, we need a definition. Let $f$ be a function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$,
$u$ and $\beta$ points in $\mathbb{R}^\sigma$, and $\gamma > 0$. We say that $f$ has a supporting paraboloid at $u \in \mathbb{R}^\sigma$ with parameters
$(\beta, \gamma)$ if

$$f(v) \leq f(u) + \langle \beta, v - u \rangle + \gamma \|v - u\|^2 \quad \text{for all } v \in \mathbb{R}^\sigma.$$

The paraboloid is said to be strictly supporting if the inequality is strict for all $v \neq u$.

Proposition 5.1. $f$ has a (strictly) supporting paraboloid at $u$ with parameters $(\beta, \gamma)$ if and only if
$f - \gamma\|\cdot\|^2$ has a (strictly) supporting hyperplane at $u$ with normal vector $[\tilde{\beta}, -1]$. The quantities $\beta$ and
$\tilde{\beta}$ are related by $\tilde{\beta} = \beta - 2\gamma u$.

Proof. The proof is based on the identity $\|v - u\|^2 = \|v\|^2 - 2\langle u, v - u \rangle - \|u\|^2$. If $f$ has a strictly
supporting paraboloid at $u$ with parameters $(\beta, \gamma)$, then for all $v \neq u$

$$f(v) - \gamma\|v\|^2 < f(u) - \gamma\|u\|^2 + \langle \tilde{\beta}, v - u \rangle,$$

where $\tilde{\beta} = \beta - 2\gamma u$. Thus $f - \gamma\|\cdot\|^2$ has a strictly supporting hyperplane at $u$ with normal vector
$[\tilde{\beta}, -1]$. The converse is proved similarly, as is the case in which the supporting hyperplane or paraboloid
is supporting but not strictly supporting. ■
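
The identity behind Proposition 5.1 can be checked numerically. The following is a minimal sketch, not part of the original argument: the test function `example_f`, the sampling ranges, and all helper names are hypothetical choices made only for illustration. It compares the gap in the paraboloid inequality for $f$ with the gap in the hyperplane inequality for $f - \gamma\|\cdot\|^2$ with slope $\beta - 2\gamma u$; by the identity in the proof, the two gaps should agree up to rounding.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_norm(a):
    return dot(a, a)

def paraboloid_gap(f, u, v, beta, gamma):
    """f(u) + <beta, v-u> + gamma*||v-u||^2 - f(v); nonnegative means the paraboloid lies above f at v."""
    d = [vi - ui for vi, ui in zip(v, u)]
    return f(u) + dot(beta, d) + gamma * sq_norm(d) - f(v)

def hyperplane_gap(f, u, v, beta, gamma):
    """Same gap, written for f - gamma*||.||^2 and the hyperplane with slope beta - 2*gamma*u."""
    d = [vi - ui for vi, ui in zip(v, u)]
    beta_tilde = [bi - 2.0 * gamma * ui for bi, ui in zip(beta, u)]
    shifted = lambda w: f(w) - gamma * sq_norm(w)
    return shifted(u) + dot(beta_tilde, d) - shifted(v)

def example_f(w):
    # Hypothetical nonconcave test function on R^2 (an assumption, not from the paper).
    return w[0] * w[1] - w[0] ** 4 - w[1] ** 4

def max_discrepancy(trials=1000, seed=0):
    """Largest difference between the two gaps over random u, v, beta, gamma."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        u = [rng.uniform(-1, 1) for _ in range(2)]
        v = [rng.uniform(-1, 1) for _ in range(2)]
        beta = [rng.uniform(-2, 2) for _ in range(2)]
        gamma = rng.uniform(0, 3)
        diff = paraboloid_gap(example_f, u, v, beta, gamma) - hyperplane_gap(example_f, u, v, beta, gamma)
        worst = max(worst, abs(diff))
    return worst
```

Because the two gaps coincide, one is positive exactly when the other is, which is the content of the proposition in both the strict and the nonstrict case.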

The first application of Theorem 3.6 is Theorem 5.2, which is formulated for dimension $\sigma = 1$. The
theorem gives a criterion guaranteeing the existence of a quadratic function $g$ such that $s - g$ is strictly
concave on dom $s$. The criterion, that $s''$ is bounded above on the interior of dom $s$, is essentially
optimal for the existence of a fixed quadratic function $g$ guaranteeing the strict concavity of $s - g$. The
situation in which $s''$ is not bounded above on the interior of dom $s$ can often be handled by Theorem 5.5,
which is a local version of Theorem 5.2.

The strict concavity of $s - g$ on dom $s$ has several important consequences concerning universal equivalence
of ensembles at the level of equilibrium macrostates and equivalence of ensembles at the thermodynamic
level; i.e., $s^{\sharp\sharp}(g, u) = s(u)$ for all $u$. As we note in part (e) of Theorem 5.2, the strict concavity of
$s - g$ also implies that the generalized canonical free energy $\varphi_g = (s - g)^*$ is differentiable on $\mathbb{R}$, a condition
guaranteeing the absence of a discontinuous, first-order phase transition with respect to the Gaussian
ensemble.

Theorem 5.3 is the analogue of Theorem 5.2 that treats arbitrary dimension $\sigma \geq 2$. When $\sigma \geq 2$, in
general the results are weaker than when $\sigma = 1$.

Theorem 5.2. Assume that the dimension $\sigma = 1$ and that dom $s$ is a nonempty interval. Assume also that
$s$ is continuous on dom $s$, $s$ is twice continuously differentiable on int(dom $s$), and $s''$ is bounded above on
int(dom $s$). Then for all sufficiently large $\gamma \geq 0$ and $g(u) = \gamma u^2$, conclusions (a)-(e) hold. Specifically, if
$s$ is strictly concave on dom $s$, then we choose any $\gamma > 0$, and otherwise we choose

$$\gamma > \gamma_0 = \tfrac{1}{2} \cdot \sup_{u \in \mathrm{int}(\mathrm{dom}\, s)} s''(u). \tag{5.1}$$

(a) $s - g$ is strictly concave and continuous on dom $s$.

(b) $s - g$ has a strictly supporting line, and $s$ has a strictly supporting paraboloid, at all $u \in$ dom $s$
except possibly boundary points. At a boundary point $s - g$ has a strictly supporting line, and $s$ has a
strictly supporting parabola, if and only if the one-sided derivative of $s - g$ is finite at that boundary point.

(c) The microcanonical ensemble and the Gaussian ensemble defined in terms of this $g$ are universally
equivalent; i.e., fully equivalent at all $u \in$ dom $s$ except possibly boundary points. For all $u \in$ int(dom $s$) the
value of $\beta$ defining the universally equivalent Gaussian ensemble is unique and is given by $\beta = s'(u) - 2\gamma u$.

(d) For all $u \in \mathbb{R}$, $s^{\sharp\sharp}(g, u) = s(u)$ or equivalently $(s - g)^{**}(u) = (s - g)(u)$.

(e) The generalized canonical free energy $\varphi_g = (s - g)^*$ is essentially smooth; in particular, $\varphi_g$ is
differentiable on $\mathbb{R}$.

Proof. (a) If $s$ is strictly concave on dom $s$, then $s(u) - \gamma u^2$ is also strictly concave on this set for any
$\gamma > 0$. We now consider the case in which $s$ is not strictly concave on dom $s$. If $g(u) = \gamma u^2$, then $s - g$ is
continuous on dom $s$. If, in addition, we choose $\gamma > \gamma_0$ in accordance with (5.1), then for all $u \in$ int(dom $s$)

$$(s - g)''(u) = s''(u) - 2\gamma < 0.$$

A straightforward extension of the proof of Theorem 4.4 in [47], in which the inequalities in the first two
displays are replaced by strict inequalities, shows that $-(s - g)$ is strictly convex on int(dom $s$) and thus
that $s - g$ is strictly concave on int(dom $s$). If $s - g$ is not strictly concave on dom $s$, then $s - g$ must be
affine on an interval. Since this violates the strict concavity on int(dom $s$), part (a) is proved.

(b) The first assertion follows from part (a) of the present theorem, part (a) of Theorem 3.6, and
Proposition 5.1. Concerning the second assertion about boundary points, the reader is referred to the discussion
before Theorem 3.3.

(c) The universal equivalence of the two ensembles is a consequence of part (a) of the present theorem
and part (b) of Theorem 3.6. The full equivalence of the ensembles at all $u \in$ int(dom $s$) is equivalent to
the existence of a strictly supporting hyperplane at all $u \in$ int(dom $s$) with supergradient $\beta$ [Thm. 3.4(a)].
Since $s(u) - \gamma u^2$ is differentiable at all $u \in$ int(dom $s$), part (b) of Theorem A.1 implies that $\beta$ is unique
and $\beta = (s(u) - \gamma u^2)' = s'(u) - 2\gamma u$.

(d) The strict concavity of $s - g$ on dom $s$ proved in part (a) implies that $s - g$ is concave on $\mathbb{R}$. Part (b)
of Theorem 4.1 allows us to conclude that for all $u \in \mathbb{R}$, $s^{\sharp\sharp}(g, u) = s(u)$ or equivalently
$(s - g)^{**}(u) = (s - g)(u)$.

(e) This follows from part (c) of Theorem 4.1. ■
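
To make the criterion (5.1) concrete, here is a minimal numerical sketch under stated assumptions. The function $s(u) = u^2 - u^4$ on $[-1, 1]$ is a hypothetical stand-in for a nonconcave entropy, not a model from the paper; the supremum in (5.1) is approximated on a grid, and strict concavity is only spot-checked through midpoints of grid pairs.

```python
def s(u):
    # Hypothetical nonconcave stand-in for the entropy on dom s = [-1, 1].
    return u ** 2 - u ** 4

def s_second(u):
    # Second derivative of the stand-in s.
    return 2.0 - 12.0 * u ** 2

def gamma_zero(second_derivative, grid):
    """Grid estimate of gamma_0 = (1/2) * sup s''(u), as in (5.1)."""
    return 0.5 * max(second_derivative(u) for u in grid)

def is_strictly_midpoint_concave(f, grid):
    """Check f((a+b)/2) > (f(a)+f(b))/2 for all pairs of distinct grid points."""
    return all(f(0.5 * (a + b)) > 0.5 * (f(a) + f(b))
               for i, a in enumerate(grid) for b in grid[i + 1:])

grid = [-1.0 + 0.05 * k for k in range(41)]
g0 = gamma_zero(s_second, grid)      # sup s'' is attained at u = 0
gamma = g0 + 0.5                     # any gamma > gamma_0 is admissible
shifted = lambda u: s(u) - gamma * u ** 2
```

For this stand-in, `s` itself fails the midpoint test near $u = 0$, while `shifted` passes it at every pair of grid points, in line with part (a) of the theorem.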

We now consider the analogue of Theorem 5.2 for arbitrary dimension $\sigma \geq 2$. In contrast to the case
$\sigma = 1$, in which $s - g$ could always be extended to a strictly concave function on all of dom $s$, in the case
$\sigma \geq 2$ there exists a quadratic $g$ such that $s - g$ is strictly concave on the interior of dom $s$, but in general
$s - g$ cannot be extended to a strictly concave function on all of dom $s$. One can easily find examples in
which the boundary of dom $s$ has flat portions and $s - g$ is strictly concave on the interior of dom $s$ and
constant on these flat portions. As a result, unless dom $s$ is open, we cannot apply part (c) of Theorem 4.1
to conclude that the generalized canonical free energy $\varphi_g = (s - g)^*$ is differentiable on $\mathbb{R}^\sigma$.

Theorem 5.3. Assume that the dimension $\sigma \geq 2$ and that dom $s$ is convex and has nonempty interior.
Assume also that $s$ is continuous on dom $s$, $s$ is twice continuously differentiable on int(dom $s$), and all
second-order partial derivatives of $s$ are bounded above on int(dom $s$). Then for all sufficiently large $\gamma \geq 0$
and $g(u) = \gamma\|u\|^2$, conclusions (a)-(e) hold. Specifically, if $s$ is strictly concave on int(dom $s$), then we
choose any $\gamma > 0$, and otherwise we choose

$$\gamma > \gamma_0 = \tfrac{1}{2} \cdot \sup_{u \in \mathrm{int}(\mathrm{dom}\, s)} k(u), \tag{5.2}$$

where $k(u)$ denotes the largest eigenvalue of the symmetric Hessian matrix of $s$ at $u$.

(a) $s - g$ is strictly concave on int(dom $s$) and concave and continuous on dom $s$.

(b) $s - g$ has a strictly supporting hyperplane, and $s$ has a strictly supporting paraboloid, at all $u \in$
dom $s$ except possibly boundary points.

(c) The microcanonical ensemble and the Gaussian ensemble defined in terms of this $g$ are universally
equivalent; i.e., fully equivalent at all $u \in$ dom $s$ except possibly boundary points. For all $u \in$ int(dom $s$)
the value of $\beta$ defining the universally equivalent Gaussian ensemble is unique and is given by
$\beta = \nabla s(u) - 2\gamma u$.

(d) For all $u \in \mathbb{R}^\sigma$, $s^{\sharp\sharp}(g, u) = s(u)$ or equivalently $(s - g)^{**}(u) = (s - g)(u)$.

(e) Assume that dom $s$ is open. Then the generalized canonical free energy $\varphi_g = (s - g)^*$ is essentially
smooth; in particular, $\varphi_g$ is differentiable on $\mathbb{R}^\sigma$.

Proof. (a) If $s$ is strictly concave on int(dom $s$), then $s - \gamma\|\cdot\|^2$ is also strictly concave on this set for any
$\gamma > 0$. We now consider the case in which $s$ is not strictly concave on int(dom $s$). If $g(u) = \gamma\|u\|^2$, then
$s - g$ is continuous on dom $s$. For $u \in$ int(dom $s$), let $Q_u = \{\partial^2 s(u) / \partial u_i \partial u_j\}$ denote the Hessian matrix
of $s$ at $u$. We choose $\gamma > \gamma_0$ in accordance with (5.2), noting that

$$\gamma_0 = \tfrac{1}{2} \cdot \sup_{u \in \mathrm{int}(\mathrm{dom}\, s)} k(u) = \tfrac{1}{2} \cdot \sup_{u \in \mathrm{int}(\mathrm{dom}\, s)} \sup\{\langle Q_u \zeta, \zeta \rangle : \zeta \in \mathbb{R}^\sigma, \|\zeta\| = 1\}. \tag{5.3}$$

Let $I$ be the identity matrix. It follows that for any $u \in$ int(dom $s$) and all nonzero $z \in \mathbb{R}^\sigma$

$$\langle (Q_u - 2\gamma I) z, z \rangle < 0.$$

By analogy with the proof of Theorem 4.5 in [47], the strict concavity of $s - g$ on int(dom $s$) is equivalent
to the strict concavity of $s - g$ on each line segment in int(dom $s$). This, in turn, is equivalent to the strict
concavity, for each $v \in$ int(dom $s$) and nonzero $z \in \mathbb{R}^\sigma$, of $\psi(\lambda) = (s - g)(v + \lambda z)$ on the open interval
$G(v, z) = \{\lambda \in \mathbb{R} : v + \lambda z \in \mathrm{int}(\mathrm{dom}\, s)\}$. Since

$$\psi''(\lambda) = \langle (Q_{v + \lambda z} - 2\gamma I) z, z \rangle < 0,$$

$\psi'$ is strictly decreasing on $G(v, z)$. A straightforward extension of the proof of Theorem 4.4 in [47], in
which the inequalities in the first two displays are replaced by strict inequalities, shows that $-\psi$ is strictly
convex on $G(v, z)$ and thus that $\psi$ is strictly concave on $G(v, z)$. It follows that $s - g$ is strictly concave on
int(dom $s$). By Proposition A.3 the continuity of $s - g$ on dom $s$ allows us to extend the strict concavity of
$s - g$ on int(dom $s$) to the concavity of $s - g$ on dom $s$. This completes the proof of part (a).

(b)-(d) These are proved as in Theorem 5.2.

(e) If dom $s$ is open, then part (a) implies that $s - g$ is strictly concave on dom $s$. The essential smoothness
of $(s - g)^*$, and thus its differentiability, are consequences of part (c) of Theorem 4.1. ■
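
The choice of $\gamma_0$ in (5.2) can be illustrated in dimension $\sigma = 2$. This is a minimal sketch under stated assumptions: the function $s(u_1, u_2) = u_1 u_2 - u_1^4 - u_2^4$ on the square $[-1, 1]^2$ is a hypothetical example, the supremum is taken over a grid, and the quadratic form is tested on finitely many unit directions only.

```python
import math

def hessian(u1, u2):
    """Hessian of the hypothetical s(u1, u2) = u1*u2 - u1**4 - u2**4,
    returned as (a, b, d) for the symmetric matrix [[a, b], [b, d]]."""
    return (-12.0 * u1 ** 2, 1.0, -12.0 * u2 ** 2)

def largest_eigenvalue(a, b, d):
    """Largest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    return 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b ** 2)

def gamma_zero(grid):
    """Grid estimate of gamma_0 = (1/2) * sup k(u), as in (5.2)."""
    return 0.5 * max(largest_eigenvalue(*hessian(u1, u2)) for u1 in grid for u2 in grid)

def quadratic_form_is_negative(gamma, grid, directions):
    """Check <(Q_u - 2*gamma*I) z, z> < 0 at every grid point u and test direction z."""
    for u1 in grid:
        for u2 in grid:
            a, b, d = hessian(u1, u2)
            for z1, z2 in directions:
                q = (a - 2 * gamma) * z1 * z1 + 2 * b * z1 * z2 + (d - 2 * gamma) * z2 * z2
                if q >= 0:
                    return False
    return True

grid = [-1.0 + 0.1 * k for k in range(21)]
directions = [(math.cos(2 * math.pi * j / 36), math.sin(2 * math.pi * j / 36)) for j in range(36)]
g0 = gamma_zero(grid)   # the largest eigenvalue is maximal at the origin for this example
```

With $\gamma = 0$ the form is positive at the origin along the diagonal direction, so this example is not concave; any $\gamma$ above `g0` makes the form negative at all tested points.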

In the next theorem we give other conditions on $s$ guaranteeing conclusions similar to those in Theorems
5.2 and 5.3.

Theorem 5.4. Assume that dom $s$ is convex, closed, and bounded and that $s$ is bounded and continuous on
dom $s$. Then there exists a continuous function $g$ mapping $\mathbb{R}^\sigma$ into $\mathbb{R}$ such that the following conclusions
hold.

(a) $s - g$ is strictly concave and continuous on dom $s$, and the generalized canonical free energy $\varphi_g =
(s - g)^*$ is essentially smooth; in particular, $\varphi_g$ is differentiable on $\mathbb{R}^\sigma$.

(b) $s - g$ has a strictly supporting hyperplane at all $u \in$ dom $s$ except possibly relative boundary points.

(c) The microcanonical ensemble and the generalized canonical ensemble defined in terms of this $g$ are
universally equivalent on dom $s$; i.e., fully equivalent at all $u \in$ dom $s$ except possibly relative boundary
points.

(d) For all $u \in \mathbb{R}^\sigma$, $s^{\sharp\sharp}(g, u) = s(u)$ or equivalently $(s - g)^{**}(u) = (s - g)(u)$.

Proof. (a) Let $h$ be any strictly concave function on $\mathbb{R}^\sigma$. Since $h$ is continuous on $\mathbb{R}^\sigma$ [47, Cor. 10.1.1], $h$
is also bounded and continuous on dom $s$. For $u \in$ dom $s$ define $g(u) = s(u) - h(u)$. Since $g$ is bounded
and continuous on the closed set dom $s$, the Tietze Extension Theorem guarantees that $g$ can be extended to
a bounded, continuous function on $\mathbb{R}^\sigma$ [fi^, Thm. 4.16]. Then $s - g$ has the properties in part (a). The strict
concavity of $s - g$ on dom $s$ implies the essential smoothness of $(s - g)^*$ and thus its differentiability
[Thm. 4.1(c)].

(b) This follows from part (a) of the present theorem and part (a) of Theorem 3.6.

(c) The universal equivalence of the two ensembles is a consequence of part (a) of the present theorem
and part (b) of Theorem 3.6.

(d) The function $g$ constructed in the proof of part (a) is bounded and continuous on $\mathbb{R}^\sigma$. In addition,
$s - g$ is strictly concave on dom $s$ and thus concave on $\mathbb{R}^\sigma$. Since $s - g$ is continuous on the closed set
dom $s$, $s - g$ is also upper semicontinuous on $\mathbb{R}^\sigma$. Part (b) of Theorem 4.1 implies that for all $u \in \mathbb{R}^\sigma$,
$s^{\sharp\sharp}(g, u) = s(u)$ or equivalently $(s - g)^{**}(u) = (s - g)(u)$. ■
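
The construction in part (a) of the proof is simple enough to sketch in one dimension. Everything below is a hypothetical illustration: the choices of `s`, `h`, and dom $s = [-1, 1]$ are assumptions, and the extension of $g$ beyond dom $s$ by its endpoint values is just one convenient bounded, continuous extension in dimension one; the Tietze Extension Theorem is what guarantees such an extension in general.

```python
def s(u):
    # Hypothetical bounded, continuous, nonconcave s on dom s = [-1, 1].
    return u ** 2 - u ** 4 - 1.0

def h(u):
    # Any strictly concave function on R; a downward parabola is used here.
    return -u ** 2

def g(u, lo=-1.0, hi=1.0):
    """g = s - h on dom s = [lo, hi], extended to all of R by its endpoint values."""
    c = min(max(u, lo), hi)   # clamp u to dom s
    return s(c) - h(c)
```

On dom $s$ the difference `s(u) - g(u)` reduces to `h(u)`, so $s - g$ inherits the strict concavity of $h$ there, which is all that part (a) requires.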

Suppose that $s$ is $C^2$ on the interior of dom $s$ but the second-order partial derivatives of $s$ are not bounded
above. This arises, for example, in the Curie-Weiss-Potts model, in which dom $s$ is a closed, bounded
interval of $\mathbb{R}$ and $s''(u) \to \infty$ as $u$ approaches the right-hand endpoint of dom $s$ [11]. In such cases one
cannot expect that the conclusions of Theorems 5.2 and 5.3 will be satisfied; in particular, that there exists
a quadratic function $g$ such that $s - g$ has a strictly supporting hyperplane at each point of the interior of
dom $s$ and thus that the ensembles are universally equivalent.

In order to overcome this difficulty, we introduce Theorem 5.5, a local version of Theorems 5.2 and 5.3.
Theorem 5.5 handles the case in which $s$ is $C^2$ on an open set $K$ but either $K$ is not all of int(dom $s$) or
$K$ = int(dom $s$) and the second-order partial derivatives of $s$ are not all bounded above on $K$. In neither
of these situations are the hypotheses of Theorem 5.2 or 5.3 satisfied. In Theorem 5.5 additional conditions
are given guaranteeing that for each $u \in K$ there exists $\gamma$ depending on $u$ such that $s - \gamma\|\cdot\|^2$ has a
strictly supporting hyperplane at $u$. Our strategy is first to choose a paraboloid that is strictly supporting
in a neighborhood of $u$ and then to adjust $\gamma$ so that the paraboloid becomes strictly supporting on all $\mathbb{R}^\sigma$.
Proposition 5.1 then guarantees that $s - \gamma\|\cdot\|^2$ has a strictly supporting hyperplane at $u$.

This construction for each $u \in K$ implies a form of universal equivalence of ensembles that is
weaker than that in Theorems 5.2 and 5.3 but is still useful. In contrast to those theorems, which
state that $s^{\sharp\sharp}(g, u) = s(u)$ for all $u \in \mathbb{R}^\sigma$, in Theorem 5.5 we prove the alternative representation
$\inf_{\gamma > 0} s^{\sharp\sharp}(g_\gamma, u) = s(u)$ for all $u$ in $K$, where $g_\gamma = \gamma\|\cdot\|^2$ for $\gamma > 0$. This alternative representation
is necessitated by the fact that the quadratic depends on $u$.

For each fixed $u \in K$ the value of $\gamma$ for which $s - \gamma\|\cdot\|^2$ has a strictly supporting hyperplane at $u$
depends on $u$. However, with the same $\gamma$ one might also have a strictly supporting hyperplane at other
values of $u$. In general, as one increases $\gamma$, the set of $u$ at which $s - \gamma\|\cdot\|^2$ has a strictly supporting
hyperplane cannot decrease. Because of part (a) of Theorem 3.4, this can be restated in terms of ensemble
equivalence involving the Gaussian ensemble and the corresponding set $\mathcal{E}(\gamma)_\beta$ of equilibrium macrostates
defined in (1.5). Defining

$$U_\gamma = \{u \in K : \text{there exists } \beta \text{ such that } \mathcal{E}(\gamma)_\beta = \mathcal{E}^u\},$$

we have $U_{\gamma_1} \subset U_{\gamma_2}$ whenever $\gamma_2 > \gamma_1$ and, because of Theorem 5.5, $\bigcup_{\gamma > 0} U_\gamma = K$. This phenomenon is
investigated in detail in [12] for the Curie-Weiss-Potts model.
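
The monotonicity of these sets in $\gamma$ can be observed numerically. The sketch below is hypothetical throughout: it uses the stand-in $s(u) = u^2 - u^4$ with $K = (-1, 1)$, and it replaces the macrostate condition defining $U_\gamma$ by the support condition that part (a) of Theorem 3.4 and Proposition 5.1 relate it to, namely that $s$ has a strictly supporting parabola at $u$ with parameters $(s'(u), \gamma)$. The condition is tested only at grid points.

```python
def s(u):
    # Hypothetical double-well stand-in for the entropy, with K = (-1, 1).
    return u ** 2 - u ** 4

def s_prime(u):
    return 2.0 * u - 4.0 * u ** 3

def has_strict_paraboloid(u, gamma, grid):
    """True if s(v) < s(u) + s'(u)(v-u) + gamma*(v-u)^2 at every grid point v != u."""
    return all(s(u) + s_prime(u) * (v - u) + gamma * (v - u) ** 2 > s(v)
               for v in grid if v != u)

def U(gamma, grid):
    """Interior grid points at which the support condition holds for this gamma."""
    return {u for u in grid[1:-1] if has_strict_paraboloid(u, gamma, grid)}

grid = [round(-1.0 + 0.02 * k, 10) for k in range(101)]
U0, U_half, U_big = U(0.0, grid), U(0.5, grid), U(1.5, grid)
```

For this stand-in the three sets are nested, the point $u = 0$ is missing from the first two, and the largest $\gamma$ recovers every interior grid point, mirroring the inclusion $U_{\gamma_1} \subset U_{\gamma_2}$ for $\gamma_2 > \gamma_1$.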
In order to state Theorem 5.5, we define for $u \in K$ and $\lambda \geq 0$

$$D(u, \nabla s(u), \lambda) = \{ v \in \mathrm{dom}\, s : s(v) \geq s(u) + \langle \nabla s(u), v - u \rangle + \lambda \|v - u\|^2 \}.$$

Geometrically, this set contains all points for which the paraboloid with parameters $(\nabla s(u), \lambda)$ passing
through $(u, s(u))$ lies below the graph of $s$. Clearly, since $\lambda \geq 0$, we have $D(u, \nabla s(u), \lambda)
\subset D(u, \nabla s(u), 0)$; the set $D(u, \nabla s(u), 0)$ contains all points for which the graph of the hyperplane with
normal vector $[\nabla s(u), -1]$ passing through $(u, s(u))$ lies below the graph of $s$. Thus, in the next theorem
the hypothesis that for each $u \in K$ the set $D(u, \nabla s(u), \lambda)$ is bounded for some $\lambda \geq 0$ is satisfied if dom $s$
is bounded or, more generally, if $D(u, \nabla s(u), 0)$ is bounded. The latter set is bounded if, for example, $-s$
is superlinear; i.e.,

$$\lim_{\|v\| \to \infty} s(v) / \|v\| = -\infty.$$

As we have remarked, the next theorem can often be applied when the hypotheses of Theorem 5.2 or 5.3
are not satisfied.

Theorem 5.5. Let $K$ be an open subset of dom $s$ and assume that $s$ is twice continuously differentiable on $K$.
Assume also that dom $s$ is bounded or, more generally, that for every $u \in K$ there exists $\lambda \geq 0$ such that
$D(u, \nabla s(u), \lambda)$ is bounded. The following conclusions hold.

(a) For each $u \in K$, define $\gamma_0(u) > 0$ by (5.7). Then for any $\gamma > \gamma_0(u)$, $s$ has a strictly supporting
paraboloid at $u$ with parameters $(\nabla s(u), \gamma)$.

(b) For each $u \in K$ we choose $\gamma > \gamma_0(u)$ as in part (a) and define $g_\gamma = \gamma\|\cdot\|^2$. Then $s - g_\gamma$ has a
strictly supporting hyperplane at $u$ with normal vector $[\nabla s(u) - 2\gamma u, -1]$.

(c) For each $u \in K$

$$\inf_{\gamma > 0} s^{\sharp\sharp}(g_\gamma, u) = \inf_{\gamma > 0} \{ g_\gamma(u) + (s - g_\gamma)^{**}(u) \} = s(u).$$

(d) For each $u \in K$ choose $g = \gamma\|\cdot\|^2$ such that, in accordance with part (b), $s - g$ has a strictly
supporting hyperplane at $u$. Then the microcanonical ensemble and the Gaussian ensemble defined in
terms of this $g$ are fully equivalent at $u$. The value of $\beta$ defining the Gaussian ensemble is unique and is
given by $\beta = \nabla s(u) - 2\gamma u$.

Proof. (a) Given $u \in K$, let $B(u, r) \subset \mathbb{R}^\sigma$ be an open ball with center $u$ and positive radius $r$ whose closure
is contained in $K$. If the dimension $\sigma = 1$, then $s''$ is bounded above on $B(u, r)$, while if $\sigma \geq 2$, then all
second-order partial derivatives of $s$ are bounded above on $B(u, r)$. We now apply, to the restriction of $s$
to $B(u, r)$, part (a) of Theorem 5.2 when $\sigma = 1$ and part (a) of Theorem 5.3 when $\sigma \geq 2$. We conclude
that there exists a sufficiently large $A > 0$ such that $s - A\|\cdot\|^2$ is strictly concave on $B(u, r)$. Part (c) of
Theorem A.4 implies that when restricted to $B(u, r)$, $s - A\|\cdot\|^2$ has a strictly supporting hyperplane at $u$;
that is, there exists $\theta \in \mathbb{R}^\sigma$ such that

$$s(v) - A\|v\|^2 < s(u) - A\|u\|^2 + \langle \theta, v - u \rangle \quad \text{for all } v \in B(u, r),\ v \neq u. \tag{5.4}$$

In fact, $\theta = \nabla s(u) - 2Au$ because $s - A\|\cdot\|^2$ is concave and differentiable on $B(u, r)$ [Thm. A.1(b)]. We
rewrite the inequality in the last display as

$$s(v) < s(u) + \langle \nabla s(u), v - u \rangle + A\|v - u\|^2 \quad \text{for all } v \in B(u, r),\ v \neq u. \tag{5.5}$$

This inequality continues to hold if we take larger values of $A$, and so without loss of generality we can
assume that $A > \lambda$. Because $s(v) = -\infty$ for $v \notin$ dom $s$, the set where the inequality in the last display
does not hold is $D(u, \nabla s(u), A)$. Since $A > \lambda$, we have $D(u, \nabla s(u), A) \subset D(u, \nabla s(u), \lambda)$, and since the
latter set is assumed to be bounded, there exists $b \in (0, \infty)$ such that

$$D(u, \nabla s(u), A) \subset \{v \in \mathbb{R}^\sigma : \|v - u\| < b\}. \tag{5.6}$$

Let $\gamma$ be any number satisfying

$$\gamma > \gamma_0(u) = \max\left\{ A, \frac{-s(u) + \|\nabla s(u)\| b}{r^2} \right\}. \tag{5.7}$$

Since $A > 0$, it follows that $\gamma_0(u) > 0$. We now prove that $s$ has a strictly supporting paraboloid at $u$ with
parameters $(\nabla s(u), \gamma)$; i.e.,

$$s(v) < s(u) + \langle \nabla s(u), v - u \rangle + \gamma\|v - u\|^2 \quad \text{for all } v \in \mathbb{R}^\sigma,\ v \neq u. \tag{5.8}$$

It suffices to prove (5.8) for all $v \in$ dom $s$. Since $\gamma > A$ and since (5.5) is valid for all $v \in B(u, r)$, $v \neq u$,
(5.8) is also valid for all $v \in B(u, r)$, $v \neq u$. In addition, for all $v \in \mathrm{dom}\, s \setminus D(u, \nabla s(u), A)$

$$s(v) < s(u) + \langle \nabla s(u), v - u \rangle + A\|v - u\|^2 < s(u) + \langle \nabla s(u), v - u \rangle + \gamma\|v - u\|^2,$$

and so (5.8) is also valid for all such $v$. We finally show that (5.8) is valid for all $v \in D(u, \nabla s(u), A) \setminus
B(u, r)$. This follows from the string of inequalities

$$\begin{aligned}
s(u) + \langle \nabla s(u), v - u \rangle + \gamma\|v - u\|^2
&\geq s(u) + \langle \nabla s(u), v - u \rangle + \gamma r^2 \\
&> s(u) - \|\nabla s(u)\| b - s(u) + \|\nabla s(u)\| b \\
&= 0 \\
&\geq s(v).
\end{aligned}$$

By proving that (5.8) is valid for all $v \in \mathbb{R}^\sigma$, we have completed the proof of part (a).

(b) This follows from part (a) of the present theorem and Proposition 5.1.

(c) By part (b), for each $u \in K$ and any $\gamma > \gamma_0(u)$, $s - g_\gamma$ has a strictly supporting hyperplane, and thus a
supporting hyperplane, at $u$. We now apply to $s - g_\gamma$ part (a) of Theorem A.4, obtaining $(s - g_\gamma)^{**}(u) =
(s - g_\gamma)(u)$ or

$$s(u) = g_\gamma(u) + (s - g_\gamma)^{**}(u).$$

Since for any $\gamma > 0$, $(s - g_\gamma)^{**}(u) \geq (s - g_\gamma)(u)$ [Prop. A.2], it follows from (4.2) that

$$s(u) = \inf_{\gamma > 0} \{ g_\gamma(u) + (s - g_\gamma)^{**}(u) \} = \inf_{\gamma > 0} s^{\sharp\sharp}(g_\gamma, u).$$

(d) Fix $u \in K$ and let $B(u, r)$ be an open ball with center $u$ and radius $r$ whose closure is contained
in $K$. The full equivalence of the ensembles follows from part (b) of the present theorem and part (a)
of Theorem 3.4. The value of $\beta$ defining the fully equivalent Gaussian ensemble is characterized by the
property that $[\beta, -1]$ is the normal vector to a strictly supporting hyperplane for $s - \gamma\|\cdot\|^2$ at $u$. In order
to identify $\beta$, we consider the concave function $h$ that equals $s - \gamma\|\cdot\|^2$ on the open ball $B(u, r)$ and equals
$-\infty$ on the complement. Since $h$ is differentiable at $u$, part (b) of Theorem A.1 implies that $\beta$ is unique and
equals $\nabla h(u) = \nabla(s - \gamma\|\cdot\|^2)(u)$. This completes the proof. ■
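
The dependence of $\gamma$ on $u$ can be seen in a one-dimensional sketch. The assumptions are as follows: the function $s(u) = (1 - u)^{3/2} - 1$ on $[0, 1]$ is a hypothetical stand-in whose second derivative blows up at the right-hand endpoint, and `gamma_threshold` is not the bound $\gamma_0(u)$ used in the proof but a grid estimate of the smallest $\gamma$ for which the parabola with parameters $(s'(u), \gamma)$ lies on or above $s$; any strictly larger $\gamma$ then gives a strictly supporting parabola on the grid.

```python
import math

def s(u):
    # Hypothetical s on dom s = [0, 1] with s''(u) -> +infinity as u -> 1.
    return (1.0 - u) ** 1.5 - 1.0

def s_prime(u):
    return -1.5 * math.sqrt(1.0 - u)

def gamma_threshold(u, grid):
    """Smallest gamma (over the grid) with s(v) <= s(u) + s'(u)(v-u) + gamma*(v-u)^2 for all v."""
    return max((s(v) - s(u) - s_prime(u) * (v - u)) / (v - u) ** 2
               for v in grid if v != u)

grid = [k / 1000.0 for k in range(1001)]
thresholds = {u: gamma_threshold(u, grid) for u in (0.5, 0.9, 0.99)}
```

For this stand-in the threshold grows without bound as $u$ approaches 1, so no single quadratic works for all $u$ in the interior, while each fixed $u$ is handled by a sufficiently large $\gamma$. This is the situation that the representation in part (c) is designed for.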



Theorem 5.5 suggests an extended form of the notion of universal equivalence of ensembles. In Theorems
5.2-5.4 we are able to achieve full equivalence of ensembles for all $u \in$ dom $s$ except possibly
relative boundary points by choosing an appropriate $g$ that is valid for all $u$. This leads to the observation
in each theorem that the microcanonical ensemble and the generalized canonical ensemble defined in terms
of this $g$ are universally equivalent. In Theorem 5.5 we can also achieve full equivalence of ensembles for
all $u \in K$. However, in contrast to Theorems 5.2-5.4, the choice of $g$ for which the two ensembles are
fully equivalent depends on $u$. We summarize the ensemble equivalence property articulated in part (d) of
Theorem 5.5 by saying that relative to the set of quadratic functions, the microcanonical ensemble and the
Gaussian ensembles are universally equivalent on the open set $K$ of mean-energy values.

We complete our discussion of the generalized canonical ensemble and its equivalence with the microcanonical
ensemble by noting that the smoothness hypothesis on $s$ in Theorem 5.5 is essentially satisfied
whenever the microcanonical ensemble exhibits no phase transition at any $u \in K$. In order to see this, we
recall that a point Uc at which s is not differentiable represents a first-order, microcanonical phase transition 
|0> Fig- 3]. In addition, a point Uc at which s is differentiable but not twice differentiable represents a 
second-order, microcanonical phase transition [23, Fig. 4]. It follows that $s$ is smooth on any open set $K$
not containing such phase-transition points. Hence, if the other conditions in Theorem 5.5 are valid, then
the microcanonical and Gaussian ensembles are universally equivalent on $K$ relative to the set of quadratic
functions. In particular, if the microcanonical ensemble exhibits no phase transitions, then $s$ is smooth
on all of int(dom $s$). This implies the universal equivalence of the two ensembles provided that the other
conditions are valid in Theorem 5.2 if $\sigma = 1$ or in Theorem 5.3 if $\sigma \geq 2$.



APPENDIX A: MATERIAL ON CONCAVE FUNCTIONS 



This appendix contains a number of technical results on concave functions needed in the main body 
of the paper. The theory of concave functions, rather than that of convex functions, is the natural setting 
for statistical mechanics. This is convincingly illustrated by the main theme of this paper, which is that 
concavity and strict concavity properties of the microcanonical entropy are closely related to the equivalence 
and nonequivalence of the microcanonical and canonical ensembles.

Let $\sigma$ be a positive integer. A function $f$ on $\mathbb{R}^\sigma$ is said to be concave on $\mathbb{R}^\sigma$, or concave, if $-f$ is a
proper convex function in the sense of [47, p. 24]; that is, $f$ maps $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$, $f \not\equiv -\infty$, and for
all $u$ and $v$ in $\mathbb{R}^\sigma$ and all $\lambda \in (0, 1)$

$$f(\lambda u + (1 - \lambda)v) \geq \lambda f(u) + (1 - \lambda) f(v).$$

Given $f \not\equiv -\infty$ a function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$, we define dom $f$ to be the set of $u \in \mathbb{R}^\sigma$
for which $f(u) > -\infty$. Let $\beta$ be a point in $\mathbb{R}^\sigma$. The function $f$ is said to have a supporting hyperplane at
$u \in$ dom $f$ with normal vector $[\beta, -1]$ if

$$f(v) \leq f(u) + \langle \beta, v - u \rangle \quad \text{for all } v \in \mathbb{R}^\sigma.$$

It follows from this inequality that $u \in$ dom $f$. In addition, $f$ is said to have a strictly supporting hyperplane
at $u \in$ dom $f$ with normal vector $[\beta, -1]$ if the inequality in the last display is strict for all $v \neq u$.

Two useful facts for concave functions on $\mathbb{R}^\sigma$ are given in the next theorem. They are proved in
Theorems 23.4 and 25.1 in [47]. The quantities appearing in Theorem A.1 are defined after Corollary 3.2 in the
present paper.

Theorem A.1. Let $f$ be a concave function on $\mathbb{R}^\sigma$. The following conclusions hold.

(a) ri(dom $f$) $\subset$ dom $\partial f$ $\subset$ dom $f$.

(b) If $f$ is differentiable at $u \in$ dom $f$, then $\nabla f(u)$ is the unique supergradient of $f$ at $u$.

Let $f \not\equiv -\infty$ be a function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$. For $\beta$ and $u$ in $\mathbb{R}^\sigma$ the Legendre-Fenchel
transforms $f^*$ and $f^{**}$ are defined by [47, p. 308]

$$f^*(\beta) = \inf_{u \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - f(u)\} \quad \text{and} \quad f^{**}(u) = \inf_{\beta \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - f^*(\beta)\}.$$

As in the case of convex functions [UtI Thm. VI.5.3], $f^*$ is concave and upper semicontinuous on $\mathbb{R}^\sigma$ and
for all $u \in \mathbb{R}^\sigma$ we have $f^{**}(u) = f(u)$ if and only if $f$ is concave and upper semicontinuous on $\mathbb{R}^\sigma$.
When $f$ is not concave and upper semicontinuous, the relationship between $f$ and $f^{**}$ is given in the next
proposition.

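Proposition A.2. Let $f \not\equiv -\infty$ be a function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$. If $f$ is not concave and
upper semicontinuous on $\mathbb{R}^\sigma$, then $f^{**}$ is the smallest concave, upper semicontinuous function on $\mathbb{R}^\sigma$ that
satisfies $f^{**}(u) \geq f(u)$ for all $u \in \mathbb{R}^\sigma$. In particular, if for some $u$, $f(u) \neq f^{**}(u)$, then $f(u) < f^{**}(u)$.

Proof. For any $u$ and $\beta$ in $\mathbb{R}^\sigma$ we have $f(u) \leq \langle \beta, u \rangle - f^*(\beta)$ and thus

$$f(u) \leq \inf_{\beta \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - f^*(\beta)\} = f^{**}(u).$$

If $\varphi$ is any concave, upper semicontinuous function satisfying $\varphi(u) \geq f(u)$ for all $u$, then $\varphi^*(\beta) \leq f^*(\beta)$
for all $\beta$, and so $\varphi^{**}(u) = \varphi(u) \geq f^{**}(u)$ for all $u$. ■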
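
The two transforms and the inequality $f^{**} \geq f$ can be explored on a grid. The sketch below rests on assumptions: the function $f(u) = u^2 - u^4$ with dom $f = [-1, 1]$ is a hypothetical example, and both infima are taken over finite grids, so the computed values only approximate $f^*$ and $f^{**}$.

```python
def f(u):
    # Hypothetical nonconcave f with dom f = [-1, 1] (f is -infinity elsewhere).
    return u ** 2 - u ** 4

u_grid = [k / 100.0 for k in range(-100, 101)]
beta_grid = [k / 100.0 for k in range(-300, 301)]

def f_star(beta):
    """f*(beta) = inf_u { beta*u - f(u) }, with the infimum taken over u_grid."""
    return min(beta * u - f(u) for u in u_grid)

# Precompute f* once so that f** does not recompute it for every u.
f_star_values = [f_star(b) for b in beta_grid]

def f_double_star(u):
    """f**(u) = inf_beta { beta*u - f*(beta) }, with the infimum taken over beta_grid."""
    return min(b * u - fs for b, fs in zip(beta_grid, f_star_values))
```

For this example `f_double_star` agrees with `f` where $f$ has a supporting line (for instance at $u = 0.9$) and lies strictly above it in the nonconcave region (for instance at $u = 0$, where the value is close to the maximum $1/4$ of $f$), which is the dichotomy stated in Proposition A.2.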

Let $f \not\equiv -\infty$ be a function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$, $u$ a point in dom $f$, and $K$ a convex subset
of dom $f$. Since $f^{**}$ is concave on $\mathbb{R}^\sigma$, the first three of the following four definitions are consistent with
Proposition A.2: $f$ is concave at $u$ if $f(u) = f^{**}(u)$; $f$ is not concave at $u$ if $f(u) < f^{**}(u)$; $f$ is concave
on $K$ if $f$ is concave at all $u \in K$; and $f$ is strictly concave on $K$ if for all $u \neq v$ in $K$ and all $\lambda \in (0, 1)$

$$f(\lambda u + (1 - \lambda)v) > \lambda f(u) + (1 - \lambda) f(v).$$



The next proposition gives a useful extension property of strictly concave functions.

Proposition A.3. Assume that dom $f$ is convex and that $f$ is strictly concave on ri(dom $f$) and continuous
on dom $f$. Then $f$ is concave on dom $f$ and on $\mathbb{R}^\sigma$.

Proof. Any point in dom $f \setminus$ ri(dom $f$) is the limit of a sequence of points in ri(dom $f$) [47, Thm. 6.1].
Hence by the continuity of $f$ on dom $f$, the strict concavity inequality for all $u \neq v$ in ri(dom $f$) can be
extended to a nonstrict inequality for all $u$ and $v$ in dom $f$. Hence $f$ is concave on dom $f$. Since $f$ equals
$-\infty$ on the complement of dom $f$, it also follows that $f$ is concave on $\mathbb{R}^\sigma$. ■

Parts (a) and (c) of the next theorem are fundamental in this paper because they relate concavity and
support properties of functions $f$ on $\mathbb{R}^\sigma$. When applied to the microcanonical entropy $s$ and to $s - g$, where
$g$ is a continuous function defining the generalized canonical ensemble, part (c) of Theorem A.4 allows us
to deduce, from strict concavity properties of $s$ and $s - g$, universal equivalence properties involving the
canonical ensemble and the generalized canonical ensemble.

Theorem A.4. Let $f \not\equiv -\infty$ be a function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$. The following conclusions hold.

(a) $f$ has a supporting hyperplane at $u \in$ dom $f$ with normal vector $[\beta, -1]$ if and only if $f(u) = f^{**}(u)$
and $\beta \in \partial f^{**}(u)$.

(b) Assume that $f$ has a supporting hyperplane at $u \in$ dom $f$ with normal vector $[\beta, -1]$. If $f$ is
differentiable at $u$ and $f = f^{**}$ in a neighborhood of $u$, then $\beta$ is unique and $\beta = \nabla f(u)$.

(c) Assume that dom $f$ is convex and that $f$ is strictly concave on ri(dom $f$) and continuous on dom $f$.
Then $f$ has a strictly supporting hyperplane at all $u \in$ dom $f$ except possibly relative boundary points. In
particular, if dom $f$ is relatively open, then $f$ has a strictly supporting hyperplane at all $u \in$ dom $f$.

Proof. (a) This is proved in part (a) of Lemma 4.1 in [19] when $f = s$. The same proof applies to general
$f$.

(b) If $f$ has a supporting hyperplane at $u \in$ dom $f$ with normal vector $[\beta, -1]$, then by part (a), $\beta \in
\partial f^{**}(u)$. If in addition $f$ is differentiable at $u$ and $f = f^{**}$ in a neighborhood of $u$, then $f^{**}$ is also
differentiable at $u$ and $\nabla f^{**}(u) = \nabla f(u)$. The conclusion that $\beta$ is unique and $\beta = \nabla f(u)$ then follows
from part (b) of Theorem A.1 applied to $f^{**}$.

(c) By Proposition A.3 the assumptions on $f$ guarantee that $f$ is concave on $\mathbb{R}^\sigma$. Since ri(dom $f$) $\subset$
dom $\partial f$ [Thm. A.1(a)], for any $u \in$ ri(dom $f$) and any $\beta \in \partial f(u)$, $f$ has a supporting hyperplane at $u$ with
normal vector $[\beta, -1]$; i.e.,

$$f(v) \leq f(u) + \langle \beta, v - u \rangle \quad \text{for all } v \in \mathbb{R}^\sigma. \tag{A.1}$$

If this hyperplane is not a strictly supporting hyperplane, then there exists $v_0 \neq u$ such that

$$f(v_0) = f(u) + \langle \beta, v_0 - u \rangle. \tag{A.2}$$

Thus $v_0 \in$ dom $f$. We claim that $f$ is strictly concave on ri(dom $f$) $\cup \{v_0\}$. If not, then $f$ must be affine on
a line segment containing $v_0$. Since this violates the strict concavity of $f$ on ri(dom $f$), the claim is proved.
Hence for all $\lambda \in (0, 1)$

$$\lambda f(u) + (1 - \lambda) f(v_0) < f(\lambda u + (1 - \lambda) v_0).$$

Substituting (A.2) gives

$$f(u) + (1 - \lambda)\langle \beta, v_0 - u \rangle < f(\lambda u + (1 - \lambda) v_0). \tag{A.3}$$

On the other hand, applying (A.1) to $v = \lambda u + (1 - \lambda) v_0$, we obtain

$$f(\lambda u + (1 - \lambda) v_0) \leq f(u) + \langle \beta, \lambda u + (1 - \lambda) v_0 - u \rangle = f(u) + (1 - \lambda)\langle \beta, v_0 - u \rangle.$$

This contradicts (A.3), proving that the supporting hyperplane at $u$ with normal vector $[\beta, -1]$ is a strictly
supporting hyperplane. We have proved that $f$ has a strictly supporting hyperplane at all $u \in$ ri(dom $f$);
that is, at all $u \in$ dom $f$ except possibly for relative boundary points.

If in addition dom $f$ is relatively open, then ri(dom $f$) = dom $f$. It follows that in this case $f$ has a
strictly supporting hyperplane at all $u \in$ dom $f$. This completes the proof of part (c). ■

The next result is applied in Theorem 4.2, which relates ensemble equivalence at the thermodynamic
level and at the level of equilibrium macrostates. Given $f \not\equiv -\infty$ a function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$,
we define

$$C(f) = \{u \in \mathbb{R}^\sigma : \exists \beta \in \mathbb{R}^\sigma \ni f(v) \leq f(u) + \langle \beta, v - u \rangle \ \forall v \in \mathbb{R}^\sigma\} \tag{A.4}$$

and

$$\Gamma(f) = \{u \in \mathbb{R}^\sigma : f(u) = f^{**}(u)\}. \tag{A.5}$$

$C(f)$ consists of all $u \in \mathbb{R}^\sigma$ such that $f$ has a supporting hyperplane at $u$, and so if $u \in C(f)$, then
$\partial f(u) \neq \emptyset$. In addition, $u \in \Gamma(f) \cap$ dom $f$ if and only if $f$ is concave at $u$.

Theorem A.5. Let $f \not\equiv -\infty$ be a function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{-\infty\}$. The following conclusions hold.

(a) $C(f) = \Gamma(f) \cap \mathrm{dom}\, \partial f^{**}$. In particular, if $f$ is concave on $\mathbb{R}^\sigma$, then $C(f) = \mathrm{dom}\, \partial f$, and so $f$ has
a supporting hyperplane at all $u \in$ dom $f$ except possibly relative boundary points.

(b) $\Gamma(f) \cap \mathrm{ri}(\mathrm{dom}\, f) \subset C(f) \subset \Gamma(f) \cap \mathrm{dom}\, f$.

(c) Except possibly for relative boundary points of dom $f$, $f$ has no supporting hyperplane at $u \in$ dom $f$
if and only if $f$ is not concave at $u$.

Proof. (a) The assertion that $C(f) = \Gamma(f) \cap \mathrm{dom}\, \partial f^{**}$ is a consequence of part (a) of Theorem A.4. Now
assume that $f$ is concave on $\mathbb{R}^\sigma$. Then, since $f = f^{**}$, it follows that $\Gamma(f) = \mathbb{R}^\sigma$, $\mathrm{dom}\, \partial f^{**} = \mathrm{dom}\, \partial f$,
and thus $C(f) = \mathrm{dom}\, \partial f$. Part (a) of Theorem A.1 implies that $f$ has a supporting hyperplane at all points
in dom $f$ except possibly relative boundary points.

(b) If $u \in \Gamma(f) \cap \mathrm{ri}(\mathrm{dom}\, f)$, then $f(u) = f^{**}(u)$ and $u \in \mathrm{ri}(\mathrm{dom}\, f^{**})$, which in turn is a subset of
$\mathrm{dom}\, \partial f^{**}$ [Thm. A.1(a)]. Hence $\Gamma(f) \cap \mathrm{ri}(\mathrm{dom}\, f) \subset \Gamma(f) \cap \mathrm{dom}\, \partial f^{**}$, which by part (a) equals $C(f)$. This
proves the first inclusion in part (b). To prove the second inclusion, we note that by part (a) $C(f) \subset \Gamma(f)$
and that for all $u \in C(f)$, $f(u) > -\infty$. Thus $C(f) \subset \Gamma(f) \cap \mathrm{dom}\, f$.

(c) If $f$ has no supporting hyperplane at $u \in$ ri(dom $f$), then $u \notin C(f)$, and so by the first inclusion in
part (b) $u \notin \Gamma(f)$; i.e., $f$ is not concave at $u$. Conversely, if $f$ is not concave at $u \in$ dom $f$, then $u \notin \Gamma(f)$,
and so by the second inclusion in part (b) $u \notin C(f)$; i.e., $f$ has no supporting hyperplane at $u$. ■